About the Course
Weekly Design
Week | Date | Pre-class | Class | PBL | Note |
---|---|---|---|---|---|
1 | 03/04/2025 | Course intro | | | |
2 | 03/11/2025 | Variable & Vector | Basic Syntax (1) | | |
3 | 03/18/2025 | Array | Basic Syntax (2) | | |
4 | 03/25/2025 | Data.frame & List | Basic Syntax (3) | Problem description | |
5 | 04/01/2025 | Data import, export, filter | Data Manipulation | Data introduction | |
6 | 04/08/2025 | Repetition, Function | Data Exploration (1) (Recorded Lecture) | Team arrangement | |
7 | 04/15/2025 | Missing values, Outliers | Data Exploration (2) | Team meeting #1 | |
8 | 04/22/2025 | Data viz intro | Data Visualization (1) | Team meeting #2 | |
9 | 04/29/2025 | Data viz practice | Data Visualization (2) | Team meeting #3 | |
10 | 05/06/2025 | <No class> | Public holiday | | |
11 | 05/13/2025 | Quiz | | | |
12 | 05/20/2025 | Shiny Intro | Interactive Web: Shiny | Team meeting #4 | |
13 | 05/27/2025 | Git, GitHub Basic | Version Control and Collaboration | Team meeting #5 | |
14 | 06/03/2025 | Quarto Intro | Reproducible Research | Team meeting #6 | |
15 | 06/10/2025 | <Team meetings> | <Team meetings> | | |
16 | 06/17/2025 | Wrap-up | Project Presentation | YouTube link submission | |
Syllabus
Week 1: Introduction to Data Science and R
Course Orientation
Introduction to R and RStudio
What is Data Science?
Week 2: Basic Syntax (1)
R syntax and basic operations
Data types and structures in R
Week 3: Basic Syntax (2)
- Data types and structures in R: Array
Week 4: Basic Syntax (3)
- Data types and structures in R: Data.frame & List
Week 5: Data Manipulation
Data import & export
Data filtering
Week 6: Data Exploration (1): Recorded lecture
Repetition
Function
Introduction to tidyverse
Data cleaning with dplyr and tidyr
Data filtering and aggregation
Data transformation with dplyr
Week 7: Data Exploration (2)
- Missing values & Outliers
- Descriptive statistics
- Grouping and summarizing data
- Joining datasets
- Exploratory data analysis (EDA)
Week 8: Data Visualization (1)
Introduction to ggplot2 for data visualization
Grammar of graphics with ggplot2
Customizing plots with themes and scales
Adding labels, titles, and legends
Creating different types of plots (scatter plots, bar plots, etc.)
Week 9: Data Visualization (2)
Advanced ggplot2 techniques
Visualizing distributions and relationships
Faceting and multi-panel plots
Plotting time series data
Interactive plots with plotly or ggplotly
Week 10: Officially No Class (Public Holiday)
Week 11: Mid-term Quiz
Week 12: Interactive Web: Shiny
What is Shiny?
Creating Shiny apps with R
Adding interactivity to data visualizations
Week 13: Version Control and Collaboration
Introduction to Git and GitHub
Collaborating with others using version control
Best practices for organizing and documenting data science projects
Working with AI (feat. ChatGPT)
Week 14: Reproducible Research
Introduction to Quarto
Creating a Quarto website with R Markdown
Customizing the website layout and design
Publishing and sharing your Quarto website
Week 15: Project Consultation
Week 16: Project Presentation
Course management
Lecturer: Changjun Lee (Associate Professor in SKKU School of Convergence)
- changjunlee@skku.edu
TA: Ye Seo Lim (Master's student, SKKU Immersive Media Engineering)
- ivisy6952@g.skku.edu
Time:
(1h): Flipped learning content
(2h): Tue 09:00 ~ 10:50
Location: International Hall High-Tech e+ Lecture Room (9B312)
The course consists of Pre-class, Class, and a PBL project
Pre-class
Students are required to watch the recorded lecture (or other assigned videos) about the concept of data science and the programming language before the offline class as part of their self-study.
To assess their understanding, students may occasionally be required to submit discussion responses.
Class
The lecturer will summarize the pre-class lecture and provide additional explanations.
Students will be asked questions about the pre-class content to assess their self-study efforts. It is acceptable to answer incorrectly, but failure to respond at all will affect their pre-class discussion score.
Students will also practice advanced coding techniques during the class.
PBL project
Students form teams that meet the following conditions.
4 to 5 members per team
Background diversity: no team made up of students from a single major
Exception: allowed if the team can justify it with sufficient reason
Datasets will be provided. Each team chooses the dataset it wants to explore based on its interests.
Teams can request a Zoom meeting with the lecturer if needed.
Final outputs (an example, not an exhaustive list)
Prepare (or collect) data
Explore data (descriptive statistics)
Set your hypotheses (or research questions)
Visualize data to examine your hypotheses or research questions
Explain your findings
Extend your findings to implications
Textbooks for the course
- R4DS: R for Data Science (written by Hadley Wickham and Garrett Grolemund)
- is an excellent resource for learning data science using R, covering data manipulation, visualization, and modeling with R. The book is available as a free online resource.
- RC2E: R Cookbook (written by JD Long and Paul Teetor)
- is a comprehensive resource for data scientists, statisticians, and programmers who want to explore the capabilities of R programming for data analysis and visualization.
- RGC: R Graphics Cookbook (written by Winston Chang)
- is a practical guide that provides more than 150 recipes to help you generate high-quality graphs quickly, without having to comb through all the details of R’s graphing systems.
- MDR: Statistical Inference via Data Science (Modern Dive) (written by Chester Ismay and Albert Y. Kim)
- is a comprehensive textbook that provides an accessible and hands-on approach to learning the fundamental concepts of statistical inference and data analysis using the R programming language.
- ISR: Introductory Statistics with R (written by Peter Dalgaard)
- is a great resource for learning basic statistics with a focus on R programming. This book covers a wide range of statistical concepts, from descriptive statistic
Grading Criteria
See Course intro in Week 1
- Attendance & Participation (20%)
Attendance is mandatory. Tardiness will result in a deduction of 0.5 points, and absences will result in a deduction of 1 point. Students who miss more than one-third of the classes will receive an F grade.
- Quiz (40%)
A quiz will be conducted to assess students’ understanding of the material covered.
- Project (40%)
The project will evaluate students’ ability to apply concepts and skills learned in class. A portion of the project grade will be weighted based on peer evaluation, reflecting team collaboration and individual contributions.
Communication
Notices & Questions
Please join the Kakao open-chat room.
When you join, please set your name exactly as it appears on the attendance sheet.
Personal counseling (scholarship, recommendation letter, etc.)
CJ-counselling room (Anything but the class content)