In-class exercise

The in-class exercise is distributed via a GitHub Classroom repository. To get access to your group’s git repository, you can follow this link.

Task 1

Check that your system satisfies the prerequisites for the templates. With the possible exception of LaTeX, this should already be the case. Also make sure that you work in a folder whose path contains no spaces.
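The two checks above can be sketched in a few lines of Python. This is only a rough helper, not part of the templates: the function names are made up for illustration, and the tool names `git` and `latex` are assumptions. The authoritative list of prerequisites is the one in the template documentation.

```python
import shutil
from pathlib import Path


def path_contains_spaces(path: Path) -> bool:
    """Return True if the string form of the path contains a space."""
    return " " in str(path)


def missing_executables(names: list[str]) -> list[str]:
    """Return the executables from `names` that are not found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    project_dir = Path.cwd().resolve()
    if path_contains_spaces(project_dir):
        print(f"Path contains spaces: {project_dir}")
    # Assumed tool names; adjust them to the documented prerequisites.
    for name in missing_executables(["git", "latex"]):
        print(f"Not found on PATH: {name}")
```

Run it from the folder into which you plan to clone the repository. If it prints nothing, neither check found a problem.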

Accept the assignment, create the repository, and clone it to your computer.

Task 2

This task should only be done by one group member!

Follow the steps outlined in the project template documentation on how to customise the template for your project. For this task, complete only the step “First step: Rename the project”.

Once you’ve done this, commit your changes and push them.

Task 3

Now all team members (except for the one who did the previous task) can clone the exercise repository and run the project.

Go to the Running the project section of the documentation and follow the instructions.

For now, ignore everything related to pre-commit hooks. We will cover them in the next session. This means that you should not run pre-commit install. If you have done so by accident, run pre-commit uninstall.

If you previously had issues with kaleido, you will have to repeat the workaround in the new environment. If you have issues due to LaTeX, make sure you have a modern distribution installed and that your installation paths are valid.

Importantly:

  1. Do not change and/or commit any files during this step.

  2. Do not continue with any other task until the entire project can be built without errors for all members of your group.

Task 4

This task should only be done by one group member!

Follow the instructions in the documentation on how to customise the template for your project.

Once you’ve done this, commit your changes and push them. Make sure that all team members pull the changes.

Task 5

The goal is to learn about an important pitfall when working with project templates.

  • Run the project at least once if you have not done so in the previous step.

  • Open the file task_data_management_template.py in the data management subfolder. Change the name of the generated file from stats4schools_smoking.pickle to data_cleaned.pickle (in the argument to the produces keyword).

  • Run the project again. It will pass.

  • Now delete the entire bld folder and run the project again. You should get an error. Discuss why this happened.

  • Describe other situations in which this problem can occur.
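For orientation, the edit described in the second bullet might look roughly like the sketch below. It assumes the task declares its output as the default value of a `produces` argument. The function name, the `BLD` constant, the `data` subfolder and the placeholder body are all hypothetical, so the actual task in the template will differ in its details.

```python
import pickle
from pathlib import Path

# Hypothetical build directory; the template defines its own path constants.
BLD = Path("bld")


def task_clean_data_sketch(
    # Before the change: BLD / "data" / "stats4schools_smoking.pickle"
    produces: Path = BLD / "data" / "data_cleaned.pickle",
) -> None:
    """Placeholder body: a real task would clean the data and store the result."""
    produces.parent.mkdir(parents=True, exist_ok=True)
    with produces.open("wb") as file:
        pickle.dump({"placeholder": True}, file)
```

Only the file name in the `produces` default changes. Everything else in the task stays as it is.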

Task 6

Set a breakpoint before the line sr = sr.replace(replace_mapping) inside the function _clean_highest_qualification in stats4schools_smoking_template.py. Use the debugger to investigate the following questions and answer them in your readme file:

  • Do we need to include any other categories in replace_mapping?

  • Why or why not?

  • Why is ordered_qualifications longer than replace_mapping?

Make sure you have read the section on setting breakpoints in task files.

Task 7 (Bonus, maybe do it at home)

  • Go through the setup of the project templates again, adding a new repository to your own GitHub account in the process.

  • Remove the template files and add all files from last week’s pytask exercise.

  • Replace all path definitions by proper path handling with a config file.

  • Make sure that pytask runs through.
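A minimal sketch of what path handling with a config file could look like is given below. It assumes that the config file sits in a source folder directly below the project root and that build output goes to a `bld` folder in the root. The helper name `project_paths` and the dictionary keys are hypothetical, so adapt them to the layout of your own project.

```python
from pathlib import Path


def project_paths(config_file: Path) -> dict[str, Path]:
    """Derive the project directories from the location of the config file."""
    src = Path(config_file).resolve().parent
    root = src.parent  # assumes the source folder sits directly below the root
    return {"SRC": src, "ROOT": root, "BLD": root / "bld"}
```

Inside the config file itself, you would call `project_paths(Path(__file__))` once and import the result in your task files. No task then needs a hard-coded absolute or relative path string.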