Learning From Machine Learning


Designing effective curricula

Cushing Memorial Library and Archives, Texas A&M University, 1949

My fondest memories of middle school, circa 2005, are of sitting hunched over in a small, underfurnished room with eight other teenagers, writing Pascal — a dated programming language — the blue glow of the terminal background reflecting on our faces. I can still recall the whirring sound of MS-DOS era computers with “Turbo” buttons and CRT screens. This was my introduction to programming. At the low rate of $130 per semester, it promised to turn mathematically curious computer geeks into software developers.

The fact that this class existed outside of the standard curriculum was a blessing. Without the pressure to cover everything that might appear on an end-of-year exam, the teacher could focus on her main goal: teaching us to code. To achieve that, she helped us develop a process and a way of reasoning. We didn't memorize arcane symbol combinations or variable-naming guidelines; we weren't even explicitly taught to test programs. We would read programs and then write some. Whenever there was an issue, the teacher would nudge us in the right direction with a guiding clue or, should the error be particularly obvious, a quick smile. We were learning all the constituent skills — design, implementation, testing, and debugging — but the tasks remained singular: “Write a program that sorts numbers” or “draw a cube.”

Without the distraction of weekly assessments, we remained engaged and entertained throughout the semester. After independently transitioning to more modern programming languages, we went on to win competitions and complete our own software projects. This happened even as some of us, myself included, were getting average marks at school.

The classes at school lacked this feedback. Nothing would start moving on a screen when we solved a math equation. There was no tangible change in the world from our work, and no intrinsic motivation to do it other than to pass tests of dubious value.

Remember the multiple-choice test that had you memorize hundreds of past-paper answers, all forgotten a month later? Even free-format university essays become meaningless when the optimal strategy for earning high marks is paraphrasing ideas from your professor's thesis.

The problem with these tests is evident; the solution less so. It has troubled not just professors, but also the computer scientists who seek to educate machines — researchers in the field of machine learning.

Machine learning helps us design algorithms that improve, or “learn,” with the addition of data. The state of the art in machine learning is alarmingly similar to how we educate humans. One shows the computer a set of completed tests — a past paper, essentially — and it learns to answer similar tests. Provided the practice papers have good coverage, we expect the algorithm to crack the real test.

Unlike with materials for educating humans, measuring the quality of such a curriculum is not only possible but straightforward. A bad one has a strong symptom, called overfitting, and it can be systematically recognized.

Overfitting occurs when the performance on practice papers exceeds expected performance on the real test.

Imagine showing a student who averages 90% on history papers a completely solved past paper. If you then remove the solution and ask them to take the same paper, the likely result would be significantly higher — maybe 95%. They would, however, return to their normal average of 90% on previously unseen papers. The best machine learning algorithms do not share this property.
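The gap between practice-paper and real-test performance can be made concrete in a few lines of code. This is a minimal sketch on a toy even/odd classification task; the names `memorizer` and `generalizer` are hypothetical, chosen to mirror the student analogy:

```python
# Toy task: classify an integer as even (0) or odd (1).
train = [(x, x % 2) for x in range(0, 40, 3)]   # the "practice paper"
test = [(x, x % 2) for x in range(100, 200)]    # the unseen "real test"

# A memorizer: perfect recall of seen questions, a fixed guess otherwise.
memory = dict(train)
def memorizer(x):
    return memory.get(x, 0)  # guesses "even" for anything unseen

# A generalizer: extracted the underlying rule from the examples.
def generalizer(x):
    return x % 2

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

print(accuracy(memorizer, train), accuracy(memorizer, test))      # 1.0 0.5
print(accuracy(generalizer, train), accuracy(generalizer, test))  # 1.0 1.0
```

The memorizer aces every question it has seen and drops to chance on new ones — that gap is the symptom of overfitting. The generalizer scores the same on both, which is exactly what we want.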

Contrary to common sense, computer scientists prefer algorithms that do not ace practice papers they have seen before. Computers are great at memorization, so if an algorithm is unable to answer something from a past paper, it means it wasn't merely attempting to memorize it. Instead, it used the solutions to develop a generalized decision process. Checking for overfitting aims to recognize and reject algorithms that are biased toward memorization, in favor of abstract learners that develop internal mind maps and knowledge structures.

The dichotomy carries over to human learning. At one end, there is the straight-A student who has all but memorized several past papers while ruthlessly limiting their learning to the boundaries of the syllabus. At the other, there is the person who comes to truly understand the subject: give them any test that follows the syllabus, and they will do well consistently.

In human tests, overfitting cannot be prevented to the same extent. Our memory is active all the time — we cannot consciously forget previously seen answers. Even on an essay question, students can recall a good set of bullet points and examples to work from, improving their performance on similar topics thereafter.

But while professors struggle to find the right adjustments, we should leverage our intelligence. We are free to adapt our own learning processes; they are not fixed before the practice papers arrive, as machine learning algorithms are. We can look beyond the symptoms and always aim for understanding rather than grades.

Whenever I'm learning, at university or beyond, I still strive to be that 13-year-old with an unwavering and clear sense of what I am really preparing for — life.