Assignment 3: Numpy and Testing#
Introduction#
In this assignment, you receive code that does a simple Monte Carlo simulation on the effect of measurement error in a linear regression.
Many people say that code for Monte Carlo simulation is untestable. After all, we simulate because we do not know what will come out of the simulation. To make matters worse, randomness is involved.
In this assignment, your task is to prove these people wrong by producing well-tested Monte Carlo code.
To avoid disappointments, here are a few rules for all tasks:
Write good commit messages and commit frequently. Use git for the entire process. Do not hesitate to commit unfinished or broken code. Git should be the only way to share code with your peers.
You only get points if you contribute. If you do not commit at all, or only commit trivial changes (like fixing a typo in a comment), you will not get points, even if your group provides a good solution.
All functions need docstrings.
Functions must not have side effects on inputs.
Do not commit generated files (e.g., plots).
Follow the rules for working with file paths (i.e., relative paths with `pathlib`).
Use the "modern pandas" settings for all exercises.
Do not use global seeds for random number generation (a short sketch of what we mean follows at the end of this section).
The deadline is December 8, 11:59 pm.
Apart from written answers in the `README.md` file, the entire solution has to consist of `.py` files. If you find it easier, you can prototype some of the functions in Jupyter notebooks. In that case, it is a good idea if each group member has their own notebook, so you do not get merge conflicts.
After some tasks we ask you to commit and set a tag in git. This is so we can easily check out the version of your repo after this step. This video explains all you need to know about tags. The tags should always be set on the main branch.
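To illustrate the rule on random seeds: rather than calling a global seeding function, create a `numpy.random.Generator` once and pass it (or the seed) explicitly to the functions that need it. A minimal sketch, with made-up function and argument names:

```python
import numpy as np


def simulate_sample(mean, std, n_obs, rng):
    """Draw a sample using the Generator that is passed in."""
    return rng.normal(loc=mean, scale=std, size=n_obs)


# The caller creates the Generator; np.random.seed(...) is never called.
rng = np.random.default_rng(925408)
sample = simulate_sample(mean=0.0, std=1.0, n_obs=100, rng=rng)
```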
Important#
Please do the tasks in the order in which they appear!
Task 1#
Follow this link. Create the repository for your group and clone it to your computers.
Task 2#
Look at the starter code in `run.py`. Explain to each other what the code does. Run the file and look at the plot it produces. Discuss in the group what you like and dislike about the code in `run.py`. Write it down in bullet points in your `README.md` file.
Task 3#
A minimal requirement to make the Monte Carlo code testable is to put it into a function. Hence you will implement the `do_monte_carlo` function in `monte_carlo.py`.
The inputs defined at the top of `run.py` will become function arguments. The data will be returned by the function.
Write a docstring for the function that describes all inputs and outputs. Use a Google-style docstring, i.e., the formatting you have seen in the docstrings we previously gave you (a sketch of such a skeleton follows below).
Import the new function in `run.py` and call it with the inputs.
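As a rough orientation, a function skeleton with a Google-style docstring could look like the sketch below. The argument names and types are only an assumption; use whatever inputs are actually defined at the top of `run.py`.

```python
import numpy as np
import pandas as pd


def do_monte_carlo(true_params, y_sd, meas_sds, n_repetitions, seed, n_obs):
    """Run a Monte Carlo simulation on the effect of measurement error.

    Note: the argument names are hypothetical; adapt them to run.py.

    Args:
        true_params (np.ndarray): True coefficients of the linear model.
        y_sd (float): Standard deviation of the error term of the outcome.
        meas_sds (np.ndarray): Standard deviations of the measurement error
            added to the first x-variable, one value per scenario.
        n_repetitions (int): Number of Monte Carlo repetitions per scenario.
        seed (int): Seed used to create the random number generator.
        n_obs (int): Number of observations per simulated sample.

    Returns:
        pd.DataFrame: Estimated parameters for each scenario and repetition.

    """
    rng = np.random.default_rng(seed)
    ...
```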
Task 4#
Set the tag `v1.0` and push it to GitHub.
Task 5#
If only one x-variable is measured with classical measurement error, you get attenuation bias, i.e., the parameter of the variable with measurement error is biased towards zero. Use this fact to write a test for the `do_monte_carlo` function in the module `test_monte_carlo.py`. You can use inputs similar to the ones used in `run.py`, but maybe make them smaller (e.g., fewer parameters) to make the test faster.
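One possible shape of such a test, assuming the hypothetical signature from the sketch above and a return value that is a DataFrame with columns `meas_sd` and `slope`; adapt the names and values to your actual interface:

```python
import numpy as np
from monte_carlo import do_monte_carlo


def test_attenuation_bias():
    # Small inputs keep the test fast; argument names and the return
    # format (a DataFrame with "meas_sd" and "slope" columns) are assumptions.
    result = do_monte_carlo(
        true_params=np.array([0.5, 0.5]),
        y_sd=1.0,
        meas_sds=np.array([0.0, 2.0]),
        n_repetitions=200,
        seed=925408,
        n_obs=500,
    )
    mean_slope = result.groupby("meas_sd")["slope"].mean()
    # With measurement error, the estimated slope should shrink towards zero.
    assert mean_slope.loc[2.0] < mean_slope.loc[0.0]
```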
Task 6#
Is this test enough? What are possible drawbacks? What is good about this test? Write down some bullet points in `README.md`. Relate it to what you learned in the screencasts.
Task 7#
Go over each line in `monte_carlo.py` and decide whether the output of that line is deterministic or random. Example: `epsilon = rng.normal(...)` is random. `y = x @ true_params + epsilon` is deterministic, even though `epsilon` is random. The reason is that, given a value of `epsilon`, all remaining calculations are deterministic.
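To make the distinction concrete, a small self-contained example with made-up names:

```python
import numpy as np

rng = np.random.default_rng(0)
true_params = np.array([1.0, 2.0])

# random: a new draw every time these lines run
x = rng.normal(size=(50, 2))
epsilon = rng.normal(scale=0.5, size=50)

# deterministic: for given x, true_params and epsilon the result never changes
y = x @ true_params + epsilon

# deterministic: OLS is a fixed computation on its inputs
estimates, *_ = np.linalg.lstsq(x, y, rcond=None)
```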
Task 8#
Split the code into smaller subfunctions. The names of the subfunctions should start with an underscore to mark that they are not meant to be called directly by a user. The interface and behaviour of `do_monte_carlo` should not change. The test you already wrote should continue to pass. When splitting the code into functions, keep the following trade-offs in mind (a minimal sketch follows these points).
It is easier to come up with test cases if functions are very small. However, in the extreme case where each function executes a single operation, your unit tests depend on the implementation and not just on interfaces.
Lines with randomness make a function harder to test. For that reason, they should be separated from deterministic lines. However, code becomes more readable when all lines that make up a single logical step are grouped into one function.
Write docstrings for all new functions. Except for the docstrings, there should be no need for comments in the code after you are done. If you still think comments are necessary, it is a sign that your function names are not good.
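A minimal sketch of the kind of structure this could lead to, assuming the hypothetical interface from above and ignoring some details of the real starter code; the helper names are made up:

```python
import numpy as np
import pandas as pd


def do_monte_carlo(true_params, y_sd, meas_sds, n_repetitions, seed, n_obs):
    """Run the simulation (the public interface stays as before)."""
    rng = np.random.default_rng(seed)
    rows = []
    for meas_sd in meas_sds:
        for _ in range(n_repetitions):
            x, y = _simulate_sample(true_params, y_sd, n_obs, rng)
            x_noisy = _add_measurement_error(x, meas_sd, rng)
            estimates = _estimate_parameters(x_noisy, y)
            rows.append({"meas_sd": meas_sd, "slope": estimates[0]})
    return pd.DataFrame(rows)


def _simulate_sample(true_params, y_sd, n_obs, rng):
    """Draw one sample of x and y from the true model (random)."""
    x = rng.normal(size=(n_obs, len(true_params)))
    y = x @ np.asarray(true_params) + rng.normal(scale=y_sd, size=n_obs)
    return x, y


def _add_measurement_error(x, meas_sd, rng):
    """Return a copy of x with noise added to the first column (no side effects)."""
    x_noisy = x.copy()
    x_noisy[:, 0] += rng.normal(scale=meas_sd, size=len(x))
    return x_noisy


def _estimate_parameters(x, y):
    """Estimate the linear model by OLS (deterministic)."""
    estimates, *_ = np.linalg.lstsq(x, y, rcond=None)
    return estimates
```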
Task 9#
Write tests for all new functions. If you want, describe your reasoning in `answers.md`, for example how you test functions that involve randomness.
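If you kept helpers like the ones sketched in the previous task, one common pattern is to test deterministic helpers against hand-checkable cases and random helpers via their statistical properties or their side effects. A sketch, with the same assumed names:

```python
import numpy as np
from monte_carlo import _add_measurement_error, _estimate_parameters


def test_estimate_parameters_recovers_exact_solution():
    # Deterministic helper: an exact, hand-checkable case.
    x = np.array([[1.0, 0.0], [0.0, 1.0]])
    y = np.array([2.0, 3.0])
    np.testing.assert_allclose(_estimate_parameters(x, y), [2.0, 3.0])


def test_add_measurement_error_does_not_modify_input():
    rng = np.random.default_rng(0)
    x = np.ones((10, 2))
    _add_measurement_error(x, meas_sd=1.0, rng=rng)
    np.testing.assert_array_equal(x, np.ones((10, 2)))


def test_add_measurement_error_has_roughly_correct_std():
    # Random helper: check a statistical property with a generous tolerance.
    rng = np.random.default_rng(0)
    x = np.zeros((10_000, 2))
    noisy = _add_measurement_error(x, meas_sd=2.0, rng=rng)
    assert abs(noisy[:, 0].std() - 2.0) < 0.1
```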
Task 10#
Set the tag `v2.0` and push it to GitHub.
Task 11#
Until now, we have mainly thought about the happy path where we have a perfect user who only provides valid inputs. Add some test cases with invalid inputs (e.g., a negative standard deviation for the measurement error or a parameter vector that contains strings). Those tests are expected to fail for now, so mark them with `@pytest.mark.xfail`.
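A sketch of what such a test could look like with the assumed interface; at this stage the call is simply expected to misbehave:

```python
import numpy as np
import pytest
from monte_carlo import do_monte_carlo


@pytest.mark.xfail(reason="Input validation is not implemented yet.")
def test_negative_measurement_sd_is_rejected_gracefully():
    # Without error handling this crashes somewhere deep inside numpy
    # with a confusing message, so the test is expected to fail for now.
    do_monte_carlo(
        true_params=np.array([0.5, 0.5]),
        y_sd=1.0,
        meas_sds=np.array([-1.0]),
        n_repetitions=2,
        seed=0,
        n_obs=10,
    )
```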
Task 12#
For each xfailed test, think about which type of exception you would like to get and what the message should tell you. Use `with pytest.raises()` to actually test that you get the right exception. The test will still fail because you have not implemented the error handling yet.
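Continuing the sketch from the previous task; the exception type and message below are choices you make, not something prescribed here:

```python
import numpy as np
import pytest
from monte_carlo import do_monte_carlo


@pytest.mark.xfail(reason="Input validation is not implemented yet.")
def test_negative_measurement_sd_raises_value_error():
    with pytest.raises(ValueError, match="standard deviation"):
        do_monte_carlo(
            true_params=np.array([0.5, 0.5]),
            y_sd=1.0,
            meas_sds=np.array([-1.0]),
            n_repetitions=2,
            seed=0,
            n_obs=10,
        )
```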
Task 13#
Set the tag `v3.0` and push it to GitHub.
Task 14#
Remove the `@pytest.mark.xfail` decorators one by one. Run the test and let it fail once, just to get the experience of test-driven development. Now implement the error handling and make the test pass.
Follow all rules for error handling from the screencasts, i.e., think about a good place for doing the error handling, use fail functions, and make sure there is no redundant error handling.
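A minimal sketch of a fail function, assuming validation happens once at the entry point of `do_monte_carlo`; the helper name and message are illustrative:

```python
import numpy as np


def do_monte_carlo(true_params, y_sd, meas_sds, n_repetitions, seed, n_obs):
    """Run the simulation, validating inputs exactly once at the entry point."""
    _fail_if_invalid_meas_sds(meas_sds)
    ...


def _fail_if_invalid_meas_sds(meas_sds):
    """Raise a ValueError if the measurement error standard deviations are invalid."""
    meas_sds = np.asarray(meas_sds)
    if not np.issubdtype(meas_sds.dtype, np.number):
        raise ValueError(
            "Measurement error standard deviations must be numeric, "
            f"got dtype {meas_sds.dtype}."
        )
    if (meas_sds < 0).any():
        raise ValueError(
            "Measurement error standard deviations must be non-negative, "
            f"got {meas_sds}."
        )
```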
Task 15#
Set the tag `v4.0` and push it to GitHub.