Python Data Science
Python Data Science is a placement-oriented program for candidates aiming at a career as a data scientist or data engineer. Its prime objective is to prepare candidates to handle complex data sets, analyze them with a range of important tools, and make the results actionable.
The Python Data Science program includes a 6-month internship to help further a career in data science. It is a bundle of five (05) major modules:
Topics Covered in the Python Data Science Course
Module 1: SAS – BASE
INTRODUCTION TO SAS
- An Overview of the SAS System, SAS Tasks, Output Produced by the System, SAS Tools (SAS Program – DATA Step and PROC Step), A Sample SAS Program, Exploring and Navigating the SAS Windowing Environment
DATA ACCESS & DATA MANAGEMENT
- SAS Data Libraries, Rules for Writing SAS Programs / Statements, Datasets & Variable Name, Getting Familiar with SAS Dataset, Data Portion of the SAS Dataset, Attributes of a variable (Numeric / Character), System Options, Dataset Options, Flow of Data Step Processing – Compilation & Execution Phase, Input Buffer, Program Data Vector (PDV), Descriptor Information of a SAS Dataset
DATA TRANSFORMATIONS
- SAS Data Values, LENGTH Statement, Creating Multiple Output SAS Datasets from a Single Input SAS Dataset, Conditionally Writing Observations to One or More Datasets, Outputting Multiple Observations (Implicit Output), Selecting Variables and Observations (DROP or KEEP Statement and DROP= or KEEP= Dataset Options), Controlling which Observations are Read (OBS= and FIRSTOBS= Options), The DATA _NULL_ Statement, The _N_ Automatic Variable, Creating Subsets of Observations:- Conditional Processing using IF-THEN and ELSE Statements, IF-THEN DO ... END and ELSE DO ... END Blocks, DO WHILE Statement, DO UNTIL Statement, Iterative DO Loop Processing, WHERE Statement or WHERE= Dataset Option, Deciding whether to use a WHERE Statement or a Subsetting IF Statement, Accumulating Totals for a Group of Data (BY-Group Processing, FIRST. & LAST.), Multiple BY Variables, DATASETS Procedure, Reading SAS Datasets and Creating Variables, Creating an Accumulating Variable (The RETAIN Statement), The DELETE Statement, The SUM Statement, The RENAME= Data Set Option, Combining SAS Datasets, Concatenating SAS Data Sets Using the SET Statement in a DATA Step, Interleaving SAS Data Sets, Merging SAS Data Sets:- Match-Merge, Using the MERGE Statement, The IN= Data Set Option, Additional Features of Merging SAS Datasets, One-to-Many Merging, Many-to-Many Merging
READING RAW DATA FROM EXTERNAL FILE
- Introduction to Raw Data, Factors to Consider when Examining Raw Data, Reading Unaligned Data (List Input), Reading Data Aligned in Columns (Column Input), Reading Data that Requires Special Instructions (Formatted Input), Controlling the Position of the Pointer in Formatted Input:- Absolute Column Pointer Control, Relative Column Pointer Control, Mixed-Style Input (Mixing List, Column, and Formatted Input Styles in one INPUT Statement), Using the Colon (:) Modifier to Specify an Informat in the INPUT Statement, Recognizing Delimiters in the Raw Data File (Using the DLM= Option in the INFILE Statement), Missing Data at the End of a Row (Using the MISSOVER Option in the INFILE Statement), Reading a Raw Data File with Multiple Records per Observation (Line Pointer Controls):- Using Multiple INPUT Statements, Using Line Pointer Control (/), Reading Variables from Multiple Records in any Order (#n), Line-Hold Specifiers in the INPUT Statement:- The Single Trailing @, The Double Trailing @@ (Multiple Observations per Record), Methods of Control in the INFILE Statement:- FLOWOVER, STOPOVER, MISSOVER, TRUNCOVER, Writing to an External File (FILE & PUT Statements), Reading Excel Spreadsheets (Import Wizard / IMPORT Procedure)
SAS FUNCTIONS
- Manipulating Character Values (SUBSTR / RIGHT / LEFT / SCAN / CONCATENATION / TRIM / FIND / INDEX / UPCASE / LOWCASE / COMPRESS / LENGTH), Manipulating Numeric Values (ROUND / CEIL / FLOOR / INT / SUM / MEAN / MIN / MAX), Manipulating Numeric Values Based on Dates (MDY / TODAY / INTCK / YRDIF), Converting Variable Types:- INPUT (Character-to-Numeric), PUT (Numeric-to-Character), Debugging SAS Programs (DEBUG Option), SAS Variable Lists, SAS Arrays, Enhancing Report Output:- Defining Titles & Footnotes, Formatting Data Values (Date, Character & Numeric Values), Creating User-Defined Formats (PROC FORMAT), Formats & Informats
ANALYSIS & PRESENTATION
- Descriptor Portion of the SAS Data Set (PROC CONTENTS), Producing List Reports (PROC PRINT), Sequencing and Grouping Observations (PROC SORT), Producing Summary Reports:- PROC FREQ (One-Way & Two-Way Frequencies), PROC MEANS, PROC REPORT, PROC TABULATE, PROC SUMMARY, PROC PRINTTO, PROC APPEND, PROC TRANSPOSE, PROC COPY, PROC COMPARE, PROC DATASETS, Regression Procedure, Analysis-of-Variance Procedures, Univariate / Multivariate Procedures, Ranking Procedure, Producing Bar and Pie Charts, Producing Plots, The Output Delivery System (SAS/ODS):- Creating HTML Reports, Creating Text Reports, Creating PDF Reports, Creating CSV Files
SAS MACRO LANGUAGE: INTRODUCTION TO THE MACRO FACILITY
- Purpose of the Macro Facility, Generating SAS Code using Macros (%MACRO & %MEND), Writing Macro-Based Programs, Replacing Text Strings using Macro Variables (%LET)
MACRO PROGRAMS
- Defining a Macro (%MACRO & %MEND), Macro Compilation, Monitoring Macro Compilation (MCOMPILENOTE Option), Calling a Macro (%macro-name), Macro Execution, Monitoring Macro Execution (MLOGIC Option), Viewing the Generated SAS Code in the Log from a Macro Program (MPRINT Option), Macro Storage, Macro Parameters: Macro Parameter Lists, Macros with Positional Parameters, Macros with Keyword Parameters, Arithmetic and Logical Operations, Conditional Processing:- %IF expression %THEN text; %ELSE text; %DO Blocks, Stored Compiled Macros, %INCLUDE Statement, Macro Processing: Tokens, Macro Triggers, How the Macro Processor Works, Macro Variable Concepts:- Referencing a Macro Variable, Displaying Macro Variable Values in the SAS Log (SYMBOLGEN Option), Automatic Macro Variables:- System-Defined Macro Variables (_AUTOMATIC_), User-Defined Macro Variables (_USER_), %LET Statement, Global Macro Variables, Local Macro Variables, Deleting User-Defined Macro Variables (%SYMDEL), Macro Functions:- Character Strings, Other SAS Functions:- %SYSFUNC, %STR, Combining Macro Variable References with Text, Macro Variable Name Delimiter, Quoting, Creating Macro Variables in the DATA Step (CALL SYMPUT Routine), Obtaining Variable Values during Macro Execution (SYMGET Function), Creating Macro Variables during PROC SQL Execution (INTO Clause), Creating a Delimited List of Values
SAS SQL PROCESSING
- Introduction to the SQL Procedure, Terminology, Features of PROC SQL, PROC SQL Syntax (SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY), VALIDATE Keyword, NOEXEC Option, Additional PROC SQL Statements (ALTER, CREATE, DELETE, DESCRIBE, DROP), FEEDBACK Option, PROC SQL and DATA Step Comparisons
QUERIES
- Retrieving Data from a Table, Identifying All Rows in a Table, Removing Duplicate Rows, Subsetting using the WHERE Clause, Subsetting with Calculated Values, Ordering Data, Enhancing Query Output (LABEL, FORMAT), Grouping Data (GROUP BY), Analyzing Groups of Data (COUNT), Updating Data Values (UPDATE Statement), Using Table Aliases, Creating Views, Creating and Dropping Indexes, Subqueries:- Non-Correlated Subquery, Correlated Subquery, Combining Tables:- Joins, Inner Joins, Outer Joins, Left Join, Right Join, Full Join, Set Operators:- EXCEPT, INTERSECT, UNION, Choosing between DATA Step Merges and SQL Joins
OVERVIEW OF METHODS FOR COMBINING SAS DATA SETS
- DEFINITIONS, CONCATENATING, INTERLEAVING, ONE-TO-ONE READING OR ONE-TO-ONE MERGING, MATCH-MERGING, UPDATING, MODIFYING, DEFINITIONS FOR READING, COMBINING, AND MODIFYING SAS DATA SETS, READING A SAS DATA SET, COMBINING SAS DATA SETS, MODIFYING SAS DATA SETS, OVERVIEW OF TOOLS, READING SAS DATA SETS, READING A SINGLE SAS DATA SET, READING FROM MULTIPLE SAS DATA SETS, COMBINING SAS DATA SETS: BASIC CONCEPTS, ONE-TO-ONE, ONE-TO-MANY AND MANY-TO-ONE, MANY-TO-MANY, ACCESS METHODS: SEQUENTIAL VERSUS DIRECT, SEQUENTIAL ACCESS, DIRECT ACCESS, ONE-TO-ONE READING, DATA STEP PROCESSING DURING A ONE-TO-ONE READING, ONE-TO-ONE MERGING, MATCH-MERGING, UPDATING WITH THE UPDATE AND THE MODIFY STATEMENTS, DEFINITIONS, SYNTAX OF THE UPDATE STATEMENT, SYNTAX OF THE MODIFY STATEMENT, UPDATING WITH NONMATCHED OBSERVATIONS, MISSING VALUES, AND NEW VARIABLES, USING AN INDEX WITH THE MODIFY STATEMENT
SAS PROCEDURES
- INTRODUCTION, THE ANATOMY OF A PROC, THE PROC STATEMENT, TITLE AND FOOTNOTE STATEMENTS, BY STATEMENT, LABEL STATEMENT, FORMAT STATEMENT, RUN OR QUIT STATEMENT, DESCRIPTION OF DATA USED IN REPORTS, SAS REPORTING PROCEDURES, PROCS FOR ALL THAT DETAIL, USING PROC PRINT, USING PROC SQL, PROC REPORT, PROCS THAT SUMMARIZE, USING PROC CHART, USING PROC FREQ, USING PROC MEANS, USING PROC UNIVARIATE, INTRODUCTION TO PROC TABULATE, DATA MANIPULATION AND MANAGEMENT PROCEDURE, PROC SORT, PROC DATASETS, PROC FORMAT, PROC CONTENTS, OTHER IMPORTANT PROCS, PROC TRANSPOSE, DEFINITIONS, PROC PRINTTO, COMPARE PROCEDURE, PROC APPEND, HOW TO IMPORT AN EXCEL FILE INTO SAS
INTRODUCTION TO PROC SQL
- INTRODUCTION, WHY LEARN PROC SQL?, SELECT STATEMENT, THE SELECT STATEMENT SYNTAX, A SIMPLE PROC SQL, A COMPLEX PROC SQL, LIMITING INFORMATION ON THE SELECT, CREATING NEW VARIABLES, THE CALCULATED OPTION ON THE SELECT, USING LABELS AND FORMATS, THE CASE EXPRESSION ON THE SELECT, ADDITIONAL SELECT STATEMENT CLAUSES, REMERGING, REMERGING FOR TOTALS, CALCULATING PERCENTAGE, SORTING THE DATA IN PROC SQL, SORT ON NEW COLUMN, SUBSETTING USING THE WHERE, INCORRECT WHERE CLAUSE, WHERE ON COMPUTED COLUMN, SELECTION ON GROUP COLUMN, USE HAVING CLAUSE, CREATING NEW TABLES, JOINING DATASETS USING PROC SQL, INNER JOIN, JOINING THREE OR MORE TABLES, OUTER JOINS, INCLUDING NONMATCHING ROWS WITH THE RIGHT OUTER JOIN, SELECTING ALL ROWS WITH THE FULL OUTER JOIN, CONCATENATING QUERY RESULTS
AN INTRODUCTION TO SAS MACROS
- INTRODUCTION, SAS MACRO OVERVIEW, TRADITIONAL SAS PROGRAMMING, THE SAS MACRO LANGUAGE, MACRO LANGUAGE COMPONENTS, MACRO VARIABLES, MACRO STATEMENTS, MACRO PROCESSOR FLOW, AUTOMATIC MACRO VARIABLES, MACRO DEBUGGING OPTIONS, WHAT IS A MACRO?, DEFINING AND USING MACROS, POSITIONAL MACRO PARAMETERS, KEYWORD MACRO PARAMETERS, CONDITIONAL MACRO COMPILATION, THE %DO STATEMENT, SAS DATA STEP INTERFACES
THE OUTPUT DELIVERY SYSTEM (ODS)
- INTRODUCTION, CREATING VARIOUS TYPES OF REPORTS, LISTING OUTPUT, OTHER DESTINATIONS, HTML, PDF AND POSTSCRIPT, RTF FILES, MICROSOFT EXCEL, ADDING STYLE TO YOUR REPORTS, LOCATE EXISTING STYLES, ODS STYLE= OPTION, CUSTOMIZE YOUR REPORTS, ODS SELECT; AND ODS EXCLUDE, OTHER CUSTOMIZATIONS, ODS PROCLABEL, ODS PROCTITLE; AND ODS NOPROCTITLE, ADVANCED TECHNIQUES, ODS DOCUMENT, PROC TEMPLATE
INTRODUCTION TO DIAGNOSING AND AVOIDING ERRORS
- INTRODUCTION, UNDERSTANDING HOW THE SAS SUPERVISOR CHECKS A JOB, UNDERSTANDING HOW SAS PROCESSES ERRORS, DISTINGUISHING TYPES OF ERRORS, SAS RECOGNIZES FOUR KINDS OF ERRORS, SYNTAX ERRORS, EXECUTION-TIME ERRORS, DATA ERRORS, SEMANTIC ERRORS, DIAGNOSING ERRORS, DIAGNOSING SYNTAX ERRORS, DIAGNOSING DATA ERRORS, USING A QUALITY CONTROL CHECKLIST
ADVANCED TOPICS IN SAS
- PERFORMING ADVANCED QUERIES USING PROC SQL, INTRODUCING MACRO VARIABLES, CREATING AND USING MACRO PROGRAMS, STORING MACRO PROGRAMS, CREATING SAMPLES AND INDEXES, USING LOOKUP TABLES TO MATCH DATA, MODIFYING SAS DATA SETS AND TRACKING CHANGES, INTRODUCTION TO EFFICIENT SAS PROGRAMMING
Module 2: Tableau
- Introduction to Tableau and an overview of the different versions.
- Installing Tableau Desktop
- Tableau Help and online resources
- Working with Tableau
- Deep diving with data and connections
- Grouping, creating sets and calculations using parameters
- Creating charts
- Adding calculations to your workbook
- Mapping data in Tableau
- Dashboard and stories
- Visualizations for an Audience
Module 3: SQL (PL / SQL Management)
Starting with SQL Server
- Setting up SQL Server, Selecting installation options, Installing a named instance
Leveraging essential tools
- SQL Server Management Studio, Configuration Manager, Transact-SQL, sqlcmd, PowerShell, Dedicated Administrator Connection
Constructing and Managing Databases
- Inspecting storage structures, Relating servers, databases and files, Creating databases and transaction logs
Designing file groups
- Maximizing storage utilization, Placing tables on file groups
Upgrading and moving databases
- Choosing between upgrade and migration, Detaching and attaching databases
Controlling database space
- Permitting automatic database growth, Adding database files to expand databases
Handling Server and Database Security
- Implementing server security, Comparing authentication modes, Defining logins, Creating user-defined server roles, Enforcing password policy
Granting database access
- Contrasting users and logins, Adding users, Defining new roles, Delegating privileges with predefined roles, Repairing mismapped logins
Granting and Revoking Permissions
- Managing database-scoped privileges, Permitting object creation, Giving blanket permissions
Defining object-level permissions
- Limiting object access, Meeting complex permission requirements with roles
Backup and Recovery
- Backing up databases, Selecting a recovery model, Investigating the transaction log, Running full, log and differential backups
Restoring databases
- Performing a post-crash log backup, Rebuilding the master database, Recovering user and system databases
- Identifying blocked processes, Killing blockers
Modules 4 & 5: Python (Core and Pandas), NumPy, SciPy & Matplotlib
- History and current use
- String Literals and numeric objects
- Collections (lists, tuples, dicts)
- Datetime classes in Python
- Memory Management in Python
- Control Flow
- Functions
- Exception Handling
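Several of the core topics listed above (collections, datetime classes, control flow, functions, and exception handling) can be seen together in a small sketch. The function name `parse_enrolment_dates` and the sample strings are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
from datetime import date, datetime


def parse_enrolment_dates(raw_dates):
    """Split date strings into parsed dates and rejected inputs.

    Hypothetical helper: expects strings in YYYY-MM-DD form.
    """
    parsed = []    # list of datetime.date objects
    rejected = []  # list of strings that could not be parsed
    for text in raw_dates:
        try:
            parsed.append(datetime.strptime(text, "%Y-%m-%d").date())
        except ValueError:
            # strptime raises ValueError for malformed or impossible dates
            rejected.append(text)
    return parsed, rejected


# Illustrative sample data: one valid date, one impossible date, one non-date
parsed, rejected = parse_enrolment_dates(["2024-01-15", "2024-02-30", "n/a"])

# Dict comprehension: days elapsed since 1 January 2024 for each valid date
days_between = {d.isoformat(): (d - date(2024, 1, 1)).days for d in parsed}
print(parsed, rejected, days_between)
```

Catching only `ValueError`, rather than using a bare `except`, keeps unrelated bugs visible while still handling bad input.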
Defining actionable, analytic questions
- Defining the quantitative construct to make inference on the question
- Identifying the data needed to support the constructs
- Identifying limitations to the data and analytic approach
- Constructing Sensitivity analyses
Bringing Data In
- Structured Data
- Working with Unstructured Text Data
NumPy: Matrix Language
- Introduction to the ndarray
- NumPy operations
- Broadcasting
- Missing data in NumPy (masked array)
- NumPy Structured arrays
- Random number generation
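The NumPy topics above can be illustrated with a minimal sketch. The sample values, the `-999.0` missing-value sentinel, and the field names are assumptions made for this example only.

```python
import numpy as np

# A 2-D ndarray: 3 observations (rows) x 2 features (columns)
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

# Broadcasting: the shape-(2,) vector of column means is applied to every row
centered = data - data.mean(axis=0)

# Masked array: treat -999.0 as a missing-value sentinel (assumed convention)
readings = np.ma.masked_equal(np.array([4.0, -999.0, 6.0]), -999.0)
mean_reading = readings.mean()  # the masked entry is ignored

# Structured array: named fields with different dtypes
people = np.array([("Ana", 31), ("Raj", 27)],
                  dtype=[("name", "U10"), ("age", "i4")])

# Reproducible random numbers with the Generator API
rng = np.random.default_rng(seed=0)
sample = rng.normal(loc=0.0, scale=1.0, size=5)

print(centered)
print(mean_reading, people["age"].mean(), sample.shape)
```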
Data Preparation with Pandas
- Filtering
- Creating and deleting variables
- Discretization of Continuous Data
- Scaling and standardizing data
- Identifying Duplicates
- Dummy Coding
- Combining Datasets
- Transposing Data
- Long to wide and back
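A short sketch of several of the data preparation steps listed above, using pandas. The data frames and column names are hypothetical, and the age bins are an arbitrary choice for illustration.

```python
import pandas as pd

# Hypothetical sample data; the second and third rows are exact duplicates
df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "city": ["Pune", "Delhi", "Delhi", "Pune"],
    "age": [22, 35, 35, 58],
})

df = df.drop_duplicates()                     # identifying and dropping duplicates
adults = df[df["age"] >= 30]                  # filtering rows
df["age_z"] = (df["age"] - df["age"].mean()) / df["age"].std()  # standardizing
df["age_band"] = pd.cut(df["age"], bins=[0, 30, 50, 100],
                        labels=["young", "middle", "senior"])   # discretization
dummies = pd.get_dummies(df["city"], prefix="city")             # dummy coding

# Combining datasets with a left join on a shared key
scores = pd.DataFrame({"id": [1, 2, 3], "score": [70, 85, 90]})
merged = df.merge(scores, on="id", how="left")

# Long to wide and back
long = pd.DataFrame({"id": [1, 1, 2, 2],
                     "term": ["t1", "t2", "t1", "t2"],
                     "mark": [5, 6, 7, 8]})
wide = long.pivot(index="id", columns="term", values="mark")
back = wide.reset_index().melt(id_vars="id", var_name="term", value_name="mark")

print(merged)
print(wide)
```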
Exploratory Data Analysis with Pandas
- Univariate Statistical Summaries and Detecting Outliers
- Multivariate Statistical Summaries and Outlier Detection
- Group-wise calculations using Pandas
- Pivot Tables
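The exploratory steps above might look like the following sketch. The sales figures are invented for illustration, and the 1.5 × IQR rule is only one common convention for flagging outliers, not necessarily the one taught in the course.

```python
import pandas as pd

# Hypothetical sales data for illustration
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "South"],
    "product": ["A", "B", "A", "B", "A"],
    "amount": [100.0, 120.0, 90.0, 110.0, 500.0],
})

# Univariate summary: count, mean, std, min, quartiles, max
summary = sales["amount"].describe()

# Flag outliers with the 1.5 * IQR rule
q1, q3 = sales["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (sales["amount"] < q1 - 1.5 * iqr) | (sales["amount"] > q3 + 1.5 * iqr)
outliers = sales[is_outlier]

# Group-wise calculation: mean amount per region
by_region = sales.groupby("region")["amount"].mean()

# Pivot table: total amount by region and product
pivot = sales.pivot_table(index="region", columns="product",
                          values="amount", aggfunc="sum")

print(summary)
print(outliers)
print(by_region)
print(pivot)
```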
Exploring Data graphically
- Histogram
- Box-and-whiskers plot
- Scatter plots
- Forest Plots
- Group-by plotting
Advanced Graphing with Matplotlib, Pandas, and Seaborn
Python, Hadoop and Spark
- Introduction to the differences between Python, Hadoop, and Spark
- Importing data from Spark and Hadoop to Python
- Parallel execution leveraging Spark or Hadoop
Missing Data
- Exploring and understanding patterns in missing data
- Missing at Random
- Missing Not at Random
- Missing Completely at Random
- Data imputation methods
Traditional Inferential Statistics
- Comparing Groups
- Correlation
Frequentist Approaches to Multivariate Statistics
- Linear Regression
- Logistic regression
Machine learning approaches to multivariate statistics
- Machine Learning Theory
- Data pre-processing
- Supervised Versus Unsupervised Learning
- Unsupervised Learning: Clustering
- Dimensionality Reduction
Supervised Learning: Regression
- Linear Regression
- Penalized Linear Regression
- Stochastic Gradient Descent
- Scoring New Data Sets
- Cross Validation
- Bias-Variance Tradeoff
- Feature Importance
Supervised Learning: Classification
- Logistic Regression
- LASSO
- Random Forest
- Ensemble Methods
- Feature Importance
- Scoring New Data Sets
- Cross Validation
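To make the cross validation topic concrete, here is a minimal sketch of how k-fold splitting works, written with the standard library only. The function name `k_fold_indices` is hypothetical; in practice a library routine such as scikit-learn's `KFold` would normally be used instead.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold cross validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, validation in folds:
    # A model would be fitted on `train` and scored on `validation` here
    print(len(train), len(validation))
```

Each sample lands in exactly one validation fold, so every observation is used for scoring once and for training in the remaining folds.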
Python Data Science is for
Designed for anyone who wants to build a career in data science.
Pre-requisites for Python Data Science
- Basic knowledge of statistics
- Exposure to Excel and Excel-based analysis
- Some programming background, preferably in C/C++
- A strong grasp of data visualisation concepts
What You Need To Bring
- Laptops / Notebooks
- Writing pad, Pen
On Completion of Python Data Science, you should have
- Working skills in Python for data science
- SAS Base programming skills
- Proficiency with Tableau BI tools
- SQL data management skills
- A free internship with a reputed organisation in data science
- Internationally recognized certificates
About Trainer
- A group of distinguished experts from the data science field
- Data scientists with more than 25 years of experience