This course introduces students to the fundamentals of spatial data science. The first part of the course introduces students to a high-level programming language (currently R). The second part covers methods to incorporate spatial data into data science workflows. The third part addresses the generation of dynamic, reproducible research output including figures, maps, manuscripts, and websites. The course includes a project for students to conduct spatial analysis related to their research. Familiarity with basic GIS concepts (raster, vector, geographic projection, etc.) will be assumed, but no prior coding experience is required.
Professor Adam M. Wilson (http://wilsonlab.io)
Office Hours: Thursdays 8:30-9:30am or by appointment in Wilkeson 120
3 Credit Hours
Tuesdays/Thursdays 2:00-3:20pm (live and in person)
The course will focus on programming in the R language. Typical class sessions will consist of a short (<30 minute) lecture followed by interactive exercises and activities. All class activities will use RStudio.
Course announcements and other materials will be distributed through UBLearns and/or our Slack Channel. Please check the sites regularly (or enable notifications).
During the course, we will complete class exercises on your personal laptop (running macOS, Linux, or Windows). If you do not have access to a laptop, please let the professor know as soon as possible.
During the week, I will attempt to respond to emails within 48 hours of receiving them (not including weekends). Do not expect an immediate response (please plan accordingly). For example, do not send an email with a question about an assignment the same day that the assignment is due. If you send an email over the weekend, do not expect any response until Monday or Tuesday.
Successful completion of this course will enable the student to:
These learning outcomes are related to those expected of students completing the Geography program.
The major course components are as follows:
Several mini-courses will be assigned via DataCamp throughout the semester. These assignments will be graded as pass/fail (pass if you finish the course, fail if you don’t). These typically take about 3-5 hours per course, but some people report taking much longer (up to 8 hours). You can use the ‘hints’ provided to complete the exercises (but try not to!). See full DataCamp Description for more details.
The course includes many tasks that are performed both in and out of class (see the tasklist). You will ‘commit’ evidence of completing these tasks to your course repository on GitHub.
Most weeks we will work on a ‘case study’ project alone or in small groups. Typically these are open-ended mini-projects in which you use your new skills to perform a task related to spatial data science.
Most weeks we will spend 15-30 minutes in a team meeting where you will discuss the previous case study in small groups. Each group will have a ‘leader’ who facilitates the meeting and shares his/her/their solution to the case study. To successfully perform as a leader, you must:
Each student will have the opportunity to introduce an R-related resource in a 5-minute presentation during class. Most students choose to describe an R package that does something they are interested in, but you could also introduce us to other kinds of resources (useful online forums, web resources, online textbooks, etc.). See here for more information about the resource presentation.
The final project will consist of a poster-length reproducible analysis published in html format. This project can be related to the student’s own research or a separate topic.
Individual tasks in the class will not be traditionally graded. You will receive full credit only if your work meets the specified criteria (there is no partial credit on tasks).
In a specifications-grading system, all tasks are evaluated on a high-standards pass/fail basis using checklists of task requirements and expectations. Letter grades are earned by passing marks on a set of tasks. This system provides for a variety of choice and is closer to how learning, and work, is done in the real world. It will be easy to tell if work is complete, done in good faith, and consistent with the requirements. The definitive word is “complete”. Starting a task or getting it almost done does not count as completing it.
Grade | Class Tasks | Case Studies | Team Leader | DataCamp | Resource Presentation | Semester Project |
---|---|---|---|---|---|---|
A | 12 | 11 | 3 | 8 | yes | yes |
A- | 11 | 10 | 3 | 8 | yes | yes |
B+ | 10 | 9 | 2 | 7 | yes | yes |
B | 9 | 8 | 2 | 7 | no | no |
B- | 8 | 7 | 1 | 6 | no | no |
C | 7 | 6 | 0 | 6 | no | no |
C- | 6 | 5 | 0 | 5 | no | no |
D | 5 | 4 | 0 | 4 | no | no |
Near the end of the semester, you may be asked to complete a coding assessment via DataCamp. It is an adaptive assessment tool that measures your data science skill level in R. The assessment will take about 10 minutes to complete (if you succeed the first time). After completing the assessment, you will receive an assessment score and percentile ranking, your skill level, an overview of your strengths and skill gaps, and personalized course recommendations for areas of improvement.
There will be no final exam.
We will read parts of R for Data Science and Geocomputation with R, both of which are available online. All additional materials will be available through the course website.
There is not a strict definition of on-time in this course. In general, on-time means that you have come to class with the reading and tasks complete so that you can actively participate in the conversation. It is up to you to define what being prepared for class means. You should note that the workload in this course does not allow you to fall behind. If you blow off a week, it will be challenging to catch back up.
This class will include ample opportunities for in-class discussion, and you should attend every class session unless you have a valid excuse as defined by the University at Buffalo’s class attendance policy:
Students may be justifiably absent from classes due to religious observances, illness documented by a physician or other appropriate health care professional, conflicts with university-sanctioned activities documented by an appropriate university administrator, public emergencies, and documented personal or family emergencies. The student is responsible for notifying the instructor in writing with as much advance notice as possible.
If you miss a class session, you are still responsible for completing the class content/assignments. Please consult with a classmate to see if there was any important information not included in the online materials.
See the University website for cancellations/delays due to weather or other unforeseen events (http://emergency.buffalo.edu/).
Academic integrity is a fundamental university value. Through the honest completion of academic work, students sustain the integrity of the university and of themselves while facilitating the university’s imperative for the transmission of knowledge and culture based upon the generation of new and innovative ideas. For more information, please refer to the Graduate Academic Integrity policy. Examples of academic dishonesty include: submitting work from another course, plagiarism, cheating, falsification, misrepresentation, and usage of confidential documents.
Writing computer code often involves use of existing code chunks (e.g., copying an example from the documentation) or, more recently, using generative AI tools (such as ChatGPT) to complete tasks. In an academic setting, this complicates the identification and definition of academic dishonesty. The primary goal of the course is to learn how to program and think as a data scientist concerning data wrangling and visualization. I want you to use your time as efficiently as possible to meet this goal.
Use of generative AI tools such as ChatGPT is allowed in this course. However, you are responsible for understanding and being able to reproduce all code you submit without the use of these tools. This means that you can use AI tools to learn how to do something, but you can’t use them to do tasks for you. There is a fine line between these two uses, so here are some examples to help guide you:
for loop and then writes their own for loop for the assignment. Sam did not violate the Academic Integrity policy.
With this in mind, here are some guiding principles.
# Got help for the next three lines of code from Jason's Task 12 script
If you used extensive AI prompts to figure something out, paste the full prompt as a comment. This will also help you learn how to write a good prompt.
If there is reason to believe that submitted code was simply copied from elsewhere, the student will be asked to verbally (and specifically) explain the code used in the analysis to ensure comprehension. They may also be asked to complete a coding challenge to demonstrate their abilities.
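As an illustration of the commenting principles above, here is a minimal sketch of how attribution comments might look in an R script. Everything in it is hypothetical: the `temps_c` data, the variable names, and the wording of the AI prompt are made up for this example, and the exact format of your comments is up to you.

```r
# Hypothetical example: the data, variable names, and prompt text are illustrative only.
temps_c <- c(12.5, 18.0, 21.3)            # made-up temperatures in degrees Celsius
temps_f <- numeric(length(temps_c))       # pre-allocate the result vector

# Got help for the next three lines of code from Jason's Task 12 script
for (i in seq_along(temps_c)) {
  temps_f[i] <- temps_c[i] * 9 / 5 + 32   # convert Celsius to Fahrenheit
}

# AI prompt used (pasted in full):
# "How do I round a numeric vector to one decimal place in R?"
temps_f_rounded <- round(temps_f, 1)

print(temps_f_rounded)
```

The point of the comments is that a reader can see which lines you wrote yourself, which were adapted from a classmate, and which came from an AI-assisted search, while you remain able to explain every line.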
If a student is suspected of academic dishonesty, then a three-step consultative resolution will be employed. First, the instructor will notify the student of the incident and arrange a meeting. Second, the instructor will orally inform the student of the sanction, which could include: warning, revision, reduction in grade, or failure of course. Third, the instructor will provide the student with a written copy of the decision. See the university policy for more information (https://catalog.buffalo.edu/policies/integrity.html). Please review it and ask if you have any questions.
If you have any disability which requires reasonable accommodations to enable you to participate in this course, please contact the Office of Accessibility Resources in 60 Capen Hall, 716-645-2608 and also the instructor of this course during the first week of class. The office will provide you with information and review appropriate arrangements for reasonable accommodations, which can be found on the web at: http://www.buffalo.edu/studentlife/who-we-are/departments/accessibility.html.
Course content is designed to be flexible to accommodate student interest and abilities. The order and timing of course topics may change as the semester progresses. See the course schedule on the website for detailed course content.