Syllabus

Course Description

This course introduces students to the fundamentals of spatial data science. The first part of the course introduces students to a high-level programming language (currently R). The second part covers methods to incorporate spatial data into data science workflows. The third part addresses the generation of dynamic, reproducible research output including figures, maps, manuscripts, and websites. The course includes a project for students to conduct spatial analysis related to their research. Familiarity with basic GIS concepts (raster, vector, geographic projection, etc.) will be assumed, but no prior coding experience is required.

Logistics

Professor Adam M. Wilson (http://wilsonlab.io)

Office Hours: Thursdays 8:30-9:30am or by appointment in Wilkeson 120

3 Credit Hours

Schedule

Tuesdays/Thursdays 2:00-3:20pm (live and in person)

Course Structure

The course will focus on programming in the R language. Typical class sessions will consist of a short (<30 minute) lecture followed by interactive exercises and activities. All class activities will use RStudio.

Course Communication (UBLearns & Slack)

Course announcements and other materials will be distributed through UBLearns and/or our Slack Channel. Please check the sites regularly (or enable notifications).

Computer requirements

During the course we will complete class exercises on your personal laptop (under any Mac, Linux, or Windows). If you do not have access to a laptop, please let the professor know as soon as possible.

Email Policy

During the week, I will attempt to respond to emails within 48 hours of receiving them (not including weekends). Do not expect an immediate response (please plan accordingly). For example, do not send an email with a question about an assignment the same day that the assignment is due. If you send an email over the weekend, do not expect any response until Monday or Tuesday.

Student Learning Outcomes

Successful completion of this course will enable the student to:

  1. convert data from varied formats/structures to desired format for analysis and visualization
  2. clean, transform, and merge data attributes/variables appropriately
  3. effectively display and communicate meaning from spatial, temporal, and textual data
  4. use current analysis, presentation, and collaboration tools in the spatial data science field.

These learning outcomes are related to those expected of students completing the Geography program.

Course Requirements

The major course components are as follows:

Data Camp courses

Several mini-courses will be assigned via DataCamp throughout the semester. These assignments will be graded as pass/fail (pass if you finish the course, fail if you don’t). These typically take about 3-5 hours per course, but some people report taking much longer (up to 8 hours). You can use the ‘hints’ provided to complete the exercises (but try not to!). See full DataCamp Description for more details.

Tasks

The course includes many tasks that are performed both in and out of class (see the tasklist). You will ‘commit’ evidence of completing these tasks to your course respository on GitHub.

Case Studies

Most weeks we will work on a ‘case study’ project alone or in small groups. Typically these are open-ended mini-projects in which you use your new skills to perform a task related to spatial data science.

Team Leadership

Most weeks we will spend 15-30 minutes in a team meeting where you will discuss the previous case study in small groups. Each group will have a ‘leader’ who facilitates the meeting and shares his/her/their solution to the case study. To successfully perform as a leader, you must:

  1. complete the class tasks / case studies before class starts
  2. be prepared to lead the discussion of the case study

See here for more information about case study leadership.

Resource presentation

Each student will have the opportunity to introduce a R-related resource in a 5 minute presentation during class. Most students choose to describe a R package that does something they are interested in, but you could also introduce us to other kinds of resources (useful online forums, web resources, online textbooks, etc). See here for more information about the resource presentation.

Final Project

The final project will consist of a poster-length reproducible analysis published in html format. This project can be related to the student’s own research or a separate topic.

Weekly Rhythm

  1. Preparation
    1. Read Assigned Material
    2. Work on class tasks and DataCamp assignments
    3. Submit questions for class discussion on Slack
  2. Class Time Tuesday
    1. Meet with your Case Study Teams
  3. Continue Preparation
    1. Finish case studies and prepare to present
    2. Work on class tasks and DataCamp assignments
    3. Submit questions for class discussion on Slack
  4. Class Time Thursday
    1. Updates & Questions from reading and daily class tasks [~10 minutes]
    2. Student Resource Presentation(s) [20 minutes]
    3. Case Study Presentation [30 minutes]
      • One group selected to share solution
      • Other groups share other approaches / solutions
      • General discussion about methods
    4. Case Study Introduction (for following week) [20 minutes]
  5. Rinse and Repeat

Grading

Individual tasks in the class will not be traditionally graded. If your work meets the specified criteria you will get full credit and only then (there is no partial credit on tasks).

In a specifications-grading system all tasks are evaluated on a high-standards pass/fail basis using checklists of task requirements and expectations. Letter grades are earned by passing marks on a set of tasks. This system provides for a variety of choice and is closer to how learning, and work, is done in the real world. It will be easy to tell if work is complete, done in good faith, and consistent with the requirements. The definitive word is “complete”. Starting them or getting them almost done is not completing.

Grade Class Tasks Case Studies Team Leader Data Camp Resource Presentation Semester Project
A 12 11 3 8 yes yes
A- 11 10 3 8 yes yes
B+ 10 9 2 7 yes yes
B 9 8 2 7 no no
B- 8 7 1 6 no no
C 7 6 0 6 no no
C- 6 5 0 5 no no
D 5 4 0 4 no no

Coding Challenge

Near the end of the semester, you may be asked to complete a coding assessment via DataCamp. It is an adaptive assessment tool that measures your data science skill level in R. The assessment will take about 10 minutes to complete (if you succeed the first time). After completing the assessment, you will receive an assessment score and percentile ranking, your skill level, an overview of your strengths and skill gaps, and personalized course recommendations for areas of improvement.

  1. To pass the challenge, you must achieve a score of 70 or greater on the assessment. You may take it multiple times until you achieve this score.
  2. Failure to pass the challenge (70+) will lower your grade 1-2 steps from the grade earned via your completed tasks.

There will be no final exam.

Semester Deliverables

All items are due on the last day of exam week (though earlier is fine too)

  • GitHub Profile
    • Course Repositories
      • Tasks
      • Case Studies
  • Grade Request
    • You will submit a cover letter (<1 page) stating:
      • the key concepts and techniques that you learned during the course (~500-1,000 words)
      • a semester task form that records your completed tasks during the semester. This is something you compile based on the assignments that you completed.
      • the score you earned on the DataCamp assessment course described above.
      • a grade request based your completion of course tasks (using the table above).
  • Final project website

Textbook

We will read parts of R for Data Science and Geocomputation with R which are both available online. All additional materials will be available through the course website.

On-time Completion

There is not a strict definition of on-time in this course. In general, on-time means that you have come to class with the reading and tasks complete so that you can actively participate in the conversation. You have to define prepared for class. You should note that the workload in this course does not allow you to fall behind. If you blow off a week, it will be challenging to catch back up.

Attendance Policy

This class will include ample opportunities for in-class discussion and you should attend every class session unless you have a valid excuse (as defined by the University at Buffalo’s class attendance policy:

Students may be justifiably absent from classes due to religious observances, illness documented by a physician or other appropriate health care professional, conflicts with university-sanctioned activities documented by an appropriate university administrator, public emergencies, and documented personal or family emergencies. The student is responsible for notifying the instructor in writing with as much advance notice as possible.

If you miss a class session, you are still responsible for completing the class content/assignments. Please consult with a classmate to see if there was any important information not included in the online materials.

See the University website for cancellations/delays due to weather or other unforeseen events (http://emergency.buffalo.edu/).

Academic Integrity & Generative AI

Academic integrity is a fundamental university value. Through the honest completion of academic work, students sustain the integrity of the university and of themselves while facilitating the university’s imperative for the transmission of knowledge and culture based upon the generation of new and innovative ideas. For more information, please refer to the Graduate Academic Integrity policy. Examples of academic dishonesty include: submitting work from another course, plagiarism, cheating, falsification, misrepresentation, and usage of confidential documents. Writing computer code often involves use of existing code chunks (e.g. copying an example from the documentation) or, more recently, using generative AI tools (such as chatGPT) to complete tasks. In an academic setting, this complicates the identification and definition of academic dishonesty. The primary goal of the course is to learn how to program and think as a data scientist concerning data wrangling and visualization. I want you to use your time as efficiently as possible to meet this goal.

Use of generative AI tools such as ChatGPT is allowed in this course. However, you are responsible for understanding and being able to reproduce all code you submit without the use of these tools. This means that you can use AI-tools to learn how to do something, but you can’t use them to do tasks for you. There is a fine line between these two uses, so here are some examples to help guide you:

Examples

  • Sam uses chatGPT to generate examples of how to write a for loop and then writes their own for loop for the assignment. Sam did not violate the Academic Integrity policy.
  • Alex pasted the assignment into chatGPT and then submitted the response in class. They did not understand the code and could not reproduce it on her own. Alex violated the Academic Integrity policy.
  • Mohammed pasted the assignment into chatCPT and then used the response to learn which functions he could use and how the functions work. He submitted a version that was very similar (or identical) to chatGPT’s response, but he understood how it worked and was able to reproduce the code without AI when asked. Mohammed did not violate the Academic Integrity policy.
  • Jon found his classmate’s repositories on Github and copy-pasted their code into his assignment and didn’t understand a few parts of it. Jon violated the policy.
  • Sumaiya was stuck on a problem and reviewed a classmate’s code for ideas. She used a few functions from the code but understood how it worked and was able to reproduce it later. She did not violate the policy.

Guidelines

With this in mind here are some guiding principles.

  • You may use AI tools (and classmates code) to learn how to do something, but you cannot simply copy/paste and submit code you didn’t write as your own. If you copy and paste code from another person or AI to complete a task and can’t recreate the script on your own or understand what it is doing you are cheating.
  • If you look at others’ code or use AI to get help solving a problem you must add comments to your code indicating what you did. For example, # Got help for the next three lines of code from Jason's Task 12 script. If you used extensive AI prompts to figure something out, paste the full prompt as a comment. This will also help you learn how to write a good prompt.

If there is reason to believe that submitted code was simply copied from elsewhere, the student will be asked to verbally (and specifically) explain the code used in the analysis to ensure comprehension. They may also be asked to complete a coding challenge to demonstrate their abilities.

If a student is suspected of academic dishonesty, then a three-step consultative resolution will be employed. First, the instructor will notify the student of the incident and arrange a meeting. Second, the instructor will orally inform the student of the sanction, which could include: warning, revision, reduction in grade, or failure of course. Third, the instructor will provide the student with a written copy of the decision. See the university policy for more information (https://catalog.buffalo.edu/policies/integrity.html). Please review it and ask if you have any questions.

Accessibility Resources

If you have any disability which requires reasonable accommodations to enable you to participate in this course, please contact the Office of Accessibility Resources in 60 Capen Hall, 716-645-2608 and also the instructor of this course during the first week of class. The office will provide you with information and review appropriate arrangements for reasonable accommodations, which can be found on the web at: http://www.buffalo.edu/studentlife/who-we-are/departments/accessibility.html.

Course Content

Course content is designed to be flexible to accommodate student interest and abilities. The order and timing of course topics may change as the semester progresses. See the course schedule on the website for detailed course content.