Introduction to R Programming for Data Science and Machine Learning
Overview
Data science is an exciting discipline that leverages Machine Learning and Artificial Intelligence to help decision makers turn raw data into understanding, insight and actionable options. With the enormous volume and variety of data being created and collected daily, Data Science is one of today’s fastest-growing and most important fields for businesses, organizations and government. Data Scientists are in demand in both industry and the public sector, with robust job growth expected well into the next decade.
Dates: Coming Soon! Spring 2021
Time: 6:00 pm – 8:30 pm / Monday & Wednesday evenings (12 weeks / remote – live online class)
Course #: CE-COMP 2239 #21769
Cost: $2,450.00
To register for this class, call 914-606-6830 and choose option 1 when prompted.
For more information or questions, please contact:
Jim Irvine, Director, Corporate and Continuing Professional Education
Phone: 914-606-6658 / email: james.irvine@sunywcc.edu
Professional Development Center / Gateway Center 75 Grasslands Road / Valhalla, NY 10595
Visit Us @ www.sunywcc.edu/pdc
Course Description:
This hands-on, project-based course is an introduction to the R and Python programming languages as well as to higher-level Microsoft Excel features. Students will learn the fundamentals of problem solving and algorithms, as well as how to use the leading development environments for tackling Data Science challenges and building real-world applications with Machine Learning capabilities. This course provides the foundational knowledge needed to advance to our IBM Skills Academy Data Science and Artificial Intelligence Practitioner programs.
Objective:
This course is for those who are interested in learning more about data science and/or pursuing a career in data science and machine learning. Learners will gain foundational knowledge of the technology's capabilities and insight into the benefits and power of Data Science and Machine Learning.
Additionally, this course provides a pathway and prerequisite to our IBM Skills Academy Data Science Practitioner program. Participants will learn ways to develop a competitive edge and how the right metrics can help to achieve strategic business goals.
Prerequisites:
Solid foundational knowledge of MS Excel.
Target Audience:
Information Architects, Data Analysts, Statisticians, Developers, Business Intelligence professionals, Business Analysts, Big Data specialists, Coders, Web Developers, learners interested in Predictive Analytics, and anyone looking to expand their skills and/or advance their career by learning these valuable, in-demand knowledge areas.
Course Outline:
Week 1: Intro to R Programming
- Getting Started with R and making sure everyone has R and RStudio installed
- R Basic Features Overview: What is R, Why learn it, why use it?
- R Syntax
- Running your first Program: Hello World
- Variables and Data Types (Strings and Numbers)
- String Concatenation
- Math Operators (+, -, *, /, ^)
- RStudio Preferences & Workspace
Week 2: Conditional Logic & Loops
- If-Else Logic: Decisions, decisions…
- Thrown for a Loop: Executing a code block over and over again
Week 3: Functions
- Functions: Blocks of Code that run when invoked (called) by name
- Pre-fab Functions: Working with Packages (Pre-written code)
- Installation of Packages: Initialization of Fun
- Initialization of Package: Functions
Week 4: Data Handling
- Vectors
- Lists
- Functions (Methods) for Data Handling
- Objects
- Matrix Operations
- Multi-Dimensional Data
- Data frames
- Categorical data
Week 5: MS Excel Data Analysis Functions You Need to Know
- Working with Formulas
- Charts and Graphs
- Concatenate
- VLOOKUP
- Exporting Data Files as CSV
- Text Functions
- Removing Duplicates
- Advanced Formulas
Week 6: Working with Imported Data
- Importing JSON (JavaScript Object Notation) syntax and files
- Importing CSV (Comma-Separated Values) files via Excel
- Manipulating and Validating Imported Data
- Applying Functions from Packages to Imported Data
Week 7: Data Visualization
- Why Data Visualization? Makes complex data digestible.
- Graphics introduction
- R Graphs
- GOplo R package for enhanced graphical representation
Week 8: Data Pre-Processing
- Statistical Functions
- Missing Value Analysis
- Outlier Detection
Week 9: Machine Learning Overview
- Machine Learning (ML) Sneak Preview: You’ll learn more about this in the next course
- Simple Linear Regression
Week 10: Machine Learning Overview, Part 2
- Multiple Linear Regression
- Logistic Regression (Classification)
Week 11: Introduction to Python
- Overview of Python Syntax: A Study in Simplicity
- Variables
- Conditional Logic
- Loops
- Functions
Week 12: Overview of Python for Machine Learning (ML)
- Python Machine Learning Libraries
- Course Wrap Up (Review / Q&A)
FAQs
What is Data Science?
Data science is an exciting discipline that allows you to turn raw data into understanding, insight, and knowledge. It uses analytics and machine learning to help users make predictions, enhance optimization, and improve operations and decision-making. The goal of “R Programming for Data Science” is to help you learn the most important tools in R that will allow you to do data science. As you progress through this course, you’ll learn how to approach a variety of data science challenges, using the best parts of R.
Why is Data Science Important?
Data is one of the most important assets in every organization because it helps business leaders make decisions based on facts, statistics and trends. The importance of data science lies in its ability to take existing data that is not necessarily useful on its own and combine it with other data points to generate insights an organization can use to learn more about its customers and audience.
Today’s data science teams are expected to answer many questions. Businesses demand better prediction and optimization based on real-time insights.
With the volume and variety of social, mobile and device data, along with new technologies and tools, data science today plays a broader role than ever before. Businesses consider data science and AI to be a technology-enabled strategy.
Are there jobs available in Data Science?
The short answer is yes. Data science is one of the fastest-growing fields today, and that growth is expected to continue into the next decade. As more fields adopt data-driven methods, the importance of data science is increasing rapidly. Its effect can be observed in multiple sectors, such as retail, healthcare, government, finance and education.
It has become an important part of almost every sector, providing solutions that help meet the challenges of ever-increasing demand and a sustainable future. As the importance of data science increases, so does the need for data scientists. If you have the skills, there are jobs available. In addition, those currently in technical careers (e.g. programming) can climb the career ladder by adding skills such as those of a data science practitioner.
What about non-technical or leadership roles in Data Science?
As the growth of data accelerates, so does the importance of data science and the teams of data scientists formed to turn this data into useful information, insight and knowledge. While companies prepare for big data integration, business leaders need to adapt their roles as team leaders for their data science employees. Your data science team should have the expertise to process data with freedom, but business leaders still need to understand the basic structures of what’s happening to create value from that data.
Why is this important for you or your organization? A New Era of Business Leader
In today’s business environment, there is no situation where it is okay for a leader to say, “I don’t know what’s going on, but my team does, and that’s good enough.” Yet many business leaders don’t know the most basic principles of data science. Business leaders (managers, directors, executives, vice presidents, etc.) don’t need to know the intimate details of data science processes, but as the line between big data and business operations disappears, it’s more important than ever for business leaders to speak (understand) a little data science. This translates into having some basic foundational knowledge.
Why it’s important to understand the basics:
Data science can be good storytelling but it is still science. Telling a story can often obscure the facts or make links where there aren’t any. Having the foundational knowledge or basic proficiency can help you avoid:
- Getting taken – manipulated data, an incomplete story and targeted information gaps can all make it easier to coerce or persuade you into a bad decision.
- Asking the wrong questions – data pulls are only as good as the questions you’re asking. Data must be evaluated regularly and that requires starting with the right question(s).
- Replicating bias – data is neutral, but its aggregation and results are often the product of our preconceived ideas. Understanding the basics of data science helps you sort out the messiness of data in the real world.