Jupyter Notebook Development Team

Overview

The Jupyter Notebook Development Teams allow undergraduate students to apply what they learn in data science courses to the production of teaching resources. Each team has a student Team Lead who is an experienced Jupyter Notebook developer. The teams collaborate with instructors from across campus to build and deploy Jupyter Notebooks for Applied Instructional Modules (DS Modules). Instructors teach DS Modules in one or more sessions of a course.

Target Audience

The Jupyter Notebook Development Team program assembles undergraduate student teams through an open application process. Students with a mix of data science skills, pedagogy knowledge, and a passion for a domain area are well suited to the program. All students take part in an apprenticeship model of near-peer learning within their teams. Participation has two tiers. Students begin in the first tier by volunteering or working for credit and can then move on to the second tier. Students who have completed at least one semester as a team member are eligible to become a Jupyter Development Team Lead. These students are paid to oversee their team's development of the Jupyter Notebook.

Goals

Notebook teams develop essential course materials. Their work with faculty produces open data science educational content in real time, and this content adds value to existing courses. As the content enters the campus community, outreach and conversations about data science and its programs at UC Berkeley also extend their reach across the institution.

The DS Modules Program Coordinator for the UC Berkeley Division of Computing, Data Science, and Society oversees the Development Teams. Before a DS Module is built, the DS Modules Program Coordinator writes a contract that sets out the expectations of the faculty member and the responsibilities of the development team. Faculty entering the process usually fall into two groups.

  • The first group is faculty who are interested in data science but have no background in it. The DS Modules Program Coordinator works with these faculty and the team of students to choose an appropriate data set and plan how it will best fit into the course they are teaching. The development team can assist in teaching the material in the lecture, sections, or both.

  • The second group is faculty who are very familiar with data science and move through the process quickly. These faculty take on more of the planning. They decide how the DS Module will function and how it will be deployed in the course.

Once the faculty member signs the contract, which includes the guidelines for the course, the team of students begins developing the DS Module together.

Key Diversity and Inclusion Practices and Strategies

The Division’s student teams recruit from disciplines all across campus to give undergraduates opportunities to lead, form connections, and shape the Berkeley data science community. Teams take on projects in the curriculum, internal operations, analytics, and more. Team structure changes from semester to semester based on where students see opportunities for the Division to grow. A holistic, codified application review process helps teams prioritize potential members who have a passion for the field and a growth mindset toward technical skills and experience.

Program Description

The Jupyter Notebook Development Teams make applied content in data science available. They work in small groups on JupyterHub and stay closely involved in the ongoing development of applied data science lessons for entry-level data scientists. Near-peer learning allows students to take on apprenticeship-level responsibilities as they move from novice (first tier) experiences into the role of Team Lead (second tier). In creating course content, student team members gain a sense of community, collaboration skills, and professional experience within a growth mindset structure as they break difficult content into accessible chunks.

Example

  • Stage 1 (Before the start of the semester): The contract is written, and the domain-specific DS Modules Program Coordinator, the Data Science Undergraduate Studies Curriculum Coordinator, any Graduate Student Instructors, and the faculty hold a meeting.

  • Stage 2 (First week of the semester): The domain-specific DS Modules Program Coordinator sends interest forms to the developers.

  • Stage 3 (Middle of the semester): The domain-specific DS Modules Program Coordinator connects the faculty and student team members over email. This email includes guidelines on the minimum requirements for collaboration on the Jupyter Notebook development. The faculty are told to expect a weekly email and to be prepared to review both the first version and the final Jupyter Notebook.

  • Stage 4: The Team Lead becomes the coordinator of communication between the development team and the faculty. The Team Lead also stays in close contact with the domain-specific DS Modules Program Coordinator.

    • If the Development Team Lead does not receive a response to an initial email and a follow-up reminder, they will contact the Program Coordinator. The Program Coordinator will then reach out to the faculty.

    • The faculty need to be able to answer questions about the set deployment date of the domain-specific DS Module weeks in advance, while the Jupyter Notebook is being developed. Feedback and change requests need to be sent early so that students have time within their course and schoolwork schedules to make the updates.

Additional Guidance for Implementation

A critical factor in moving this program from university to university is the infrastructure. UC Berkeley's data science program relies on existing campus resources, a specific group of staff, and agreed-upon guidelines.

  • The program assumes that all students have access to a laptop or Chromebook because the library has a lending program.

  • The program currently uses the previously existing Data Hub and data puller.

  • UC Berkeley’s program has an instructional designer with some data science teaching experience. Having someone who works on the program full time and knows how to teach foundational data science is a crucial component of the Jupyter Notebook Development Teams. Many faculty would like to add data science content to their courses through DS Modules and work with the development teams but do not know how to begin. A staff member with this experience can do the background work of finding and preparing an appropriate data set and can help translate the intended learning experience for the Development Team.

  • UC Berkeley’s program has specific foundational criteria for its data sets. Each data set must contain appropriate content, be licensed for the program's use, and be one that students can work with efficiently. Again, for faculty without a background in data science, meeting these criteria independently might not be feasible.

Recommendations

Finally, one idea for the program that is not yet deployed is to make DS Module development repeatable by using templates. A set of DS Module templates would include domain-agnostic templates for linear regression, cleaning data, hypothesis testing, and other familiar topics. The Jupyter Notebook Development Teams would build specific DS Modules from these templates. The resulting modules may be more formulaic than those from the current process at UC Berkeley, but templates could make the management of DS Module development simpler.
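To make the template idea more concrete, here is a minimal sketch of what a domain-agnostic linear regression template might look like. It is not part of the UC Berkeley program. The function names, the course title, the file name, and the column names are all hypothetical, and the sketch assumes the version 4 notebook JSON layout and that the generated notebook would later run in an environment with pandas and NumPy installed.

```python
import json


def make_cell(cell_type, source):
    """Build one cell in the version 4 notebook JSON layout."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells carry two extra fields, even when they have not been run.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


def linear_regression_template(course, csv_path, x_col, y_col):
    """Hypothetical template: fill a fixed lesson outline with course-specific names."""
    cells = [
        make_cell(
            "markdown",
            f"# {course}: Linear Regression\n\n"
            f"How does `{x_col}` relate to `{y_col}`?",
        ),
        make_cell(
            "code",
            f"import pandas as pd\n\ndata = pd.read_csv({csv_path!r})\ndata.head()",
        ),
        make_cell(
            "code",
            "import numpy as np\n\n"
            f"slope, intercept = np.polyfit(data[{x_col!r}], data[{y_col!r}], 1)\n"
            "slope, intercept",
        ),
        make_cell(
            "markdown",
            "**Question:** Interpret the slope in the context of this course.",
        ),
    ]
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Illustrative values only; a real DS Module would use the data set agreed on with the faculty.
notebook = linear_regression_template(
    course="Example Course",
    csv_path="example_data.csv",
    x_col="rainfall",
    y_col="crop_yield",
)
notebook_json = json.dumps(notebook, indent=1)
```

Saving `notebook_json` to a file with an `.ipynb` extension would give a team a starting notebook to extend. Under this approach, only the data set, column names, and discussion questions would change from one DS Module to the next.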