A Gentle Introduction to Robotics

Telling People Numbers

“There are lies, damned lies, and statistics,” Mark Twain is said to have said. Darrell Huff, in How To Lie With Statistics, gives lots of ideas on how to lie with numbers. Well, not really lie, but create an incorrect impression. The point of the book, though, is not really to tell you how to lie with numbers. It is to teach you what lying with numbers looks like, so you can recognize when someone is doing it to you. (You’d be amazed, seriously. Once you know what you’re looking for, you see it all the time. It’s like bad kerning that way.) And perhaps, to teach you how to recognize when you are doing it yourself. Accidentally, of course. So this section is about how to avoid lying with numbers, because it’s actually a lot easier than it looks. People don’t actually do numbers very well.

When a decision maker is wrong about what correct numbers mean, and makes organizational decisions based on their misunderstanding, there can be substantial negative results. Typically, as professionals or even as simply reasonable people, we would like not to contribute to organizational dysfunction. So it behooves us to be aware of how we present information, to try to guard against confusion, misunderstanding, and misinterpretation.

Now, you cannot prevent misinterpretation by someone with a vested interest in a misinterpretation. Someone who “just knows” what the answer “should” be is impervious to measurements, observable facts, and reasoning: the reasoning is obviously wrong (as it gets to the wrong conclusion), the measurements are obviously fake, and the facts are obviously fabricated (as they support the wrong conclusion). The perceptions will always support the desired conclusion, no matter how tortured the necessary “logic.” But you can guard against some common human cognitive failings by which a disinterested person might be led astray; you can make the truth apparent to those without a vested interest in the outcome. This may or may not be enough, but it is typically worthwhile.

This section is, more or less, about how people commonly misunderstand data. Try to avoid presenting data in ways that invite misunderstanding.

The Base Rate Fallacy

The mandatory Dilbert.

The base rate fallacy is ignoring what you would see if there were no effect going on. In the Dilbert cartoon, the pointy-haired boss has been told that 40% of sick days taken by his workers are Mondays and Fridays. 40% sounds like a lot; in fact, it’s almost half! Obviously the workers are taking fake sick days in order to extend their weekends — Saturdays and Sundays. And yet, Mondays and Fridays together constitute 40% of the five-day work week. If workers took sick days at random — a reasonable result if being sick is itself random — then each of the five days in the work week would, on average, represent 20% of sick days taken; and adding two of these days together — any two — would, on average, represent 40% of sick days taken. And so 40% of sick days being taken on Monday or Friday is not, in fact, evidence of workers slacking off. It is in fact not evidence of anything, because we don’t know the time period of the observation, or the number of sick days taken, or the number of employees, or anything other than just this one fraction. And a fraction — even 0% or 100% — all by itself is meaningless: there is simply not enough information for there to be meaning.

Suppose we want to avoid our information triggering a base rate fallacy. What sort of information would have meaning? At a minimum, the actual number of sick days taken on each of the days of the week. Better would be a bar chart: one bar per day of the week, with height (and so area) proportional to the number of sick days taken on that day. This gives exactly the same information as the numbers, but is more easily perceived. Most people are much better at comparing heights and areas right next to one another than at comparing numbers right next to one another: the first we call “intuition” (although it is actually a skill, called conservation, typically learned in the concrete operational stage of cognitive development in the primary school years) while the second is far more abstract. Best would be an actual statistical test of whether any difference from uniformity is statistically significant — along with the bar chart, because while “p < 0.05” means something very specific to a statistician, it does not to a normal person. Does this protect the entire information flow against the base rate fallacy? No. The guy with an axe to grind will tell everyone that our careful numbers prove 40% of sick days are taken on a Monday or Friday, so fire those slackers! But at least the people we talked with will have a truthful impression — not that that necessarily does anything, unfortunately.
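For the sick-day example, that “actual statistical test” can be as small as a chi-square goodness-of-fit check against the uniform base rate. Here is a minimal sketch in Python; the counts are invented for illustration, and 9.488 is the standard chi-square critical value for four degrees of freedom at the 0.05 level.

```python
# Hypothetical sick-day counts per workday; a real check would use real data.
observed = {"Mon": 21, "Tue": 18, "Wed": 20, "Thu": 19, "Fri": 22}

total = sum(observed.values())
expected = total / len(observed)  # uniform base rate: 20% per workday

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((n - expected) ** 2 / expected for n in observed.values())

mon_fri = observed["Mon"] + observed["Fri"]
print(f"Mon+Fri share: {mon_fri / total:.0%}")  # 43%: close to the 40% uniformity predicts
print(f"chi-square = {chi_sq:.2f}")

# 9.488 is the 0.05 critical value for 4 degrees of freedom.
print("significant" if chi_sq > 9.488 else "consistent with random sick days")
```

If SciPy is available, `scipy.stats.chisquare(list(observed.values()))` computes the same statistic and also supplies a p-value.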

Information Without Context

Presenting information without context is a typical propaganda technique, and is even more effective than the Big Lie technique because the information is actually true. Almost anything can be presented with any emotional tone you want, as long as it is without context. As I write this, wind turbines’ ecological impact is in the news. “Wind turbines generate tens of thousands of waste blades every year that need to be landfilled! This is an ecological disaster!” Well, wind turbines do indeed generate waste. Their blades need to be replaced every decade or so due to wear, and one or a set might get damaged. A 1.5 megawatt turbine goes through about 100 tons of blades in a 25-year lifetime. So a 1.5 megawatt wind turbine generates electricity for only about 1250 households, only about 3150 people, but does this at the horrible cost of as much waste headed to the landfill as five entire ... umm, people. Or two households. (Americans generate 1600 pounds of waste per person per year. American households average 2.52 persons.)
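The arithmetic behind that comparison fits in a few lines. The figures are the ones quoted above, not independently sourced:

```python
# Figures quoted in the text above (not independently sourced).
blade_waste_tons = 100      # tons of blades over a turbine's lifetime
lifetime_years = 25
person_waste_lbs = 1600     # lb of municipal waste per American per year
persons_per_household = 2.52

waste_per_year_lbs = blade_waste_tons / lifetime_years * 2000  # 8,000 lb/year
people_equiv = waste_per_year_lbs / person_waste_lbs
households_equiv = people_equiv / persons_per_household

print(f"{people_equiv:.0f} people, {households_equiv:.1f} households")
# → 5 people, 2.0 households
```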

But you have to admit “tens of thousands of waste blades every year” sounds a lot more dramatic than “five people,” doesn’t it? And it’s not that the blades can’t be recycled, it’s just that they aren’t — in the United States. Europe recycles wind turbine blades just fine. And producing the same power with a coal-fired plant would produce 61 thousand tons of ash and 7.2 million tons of carbon dioxide. Maybe the “ecological disaster” part is a bit, well, overstated.

The base rate fallacy is just a human cognitive glitch — almost everyone is at risk from it, and it takes awareness and humility to avoid it. The impact of context-free information is another such glitch: once we have accepted knowledge from someone, we tend to also accept the attitude towards that knowledge from the same someone. Related to this is the human susceptibility to the Big Lie: we typically judge the truth of information by the earnestness of the teller. If someone very earnestly insists the sky was green yesterday, we will find some reason to believe it: volcanic ash suspended in the air, light from Venus refracting through swamp gas, tornado weather, auroras from a solar storm, whatever; we will find some reason to regard it as plausible. Humans are a social species, with a strong urge to tribalism: we need to have social acceptance, and so extend it to others. The manipulative exploit these cognitive glitches.

So, always present your information with context. What is the impact compared to the alternatives?

Incomparable Numbers

Incomparable numbers are another rich source of confusion. We’ve already talked about how computing an “average star rating” is pandering to innumeracy: “number of stars” is an ordinal scale, not an interval scale, so taking averages simply makes no sense. A friend of mine had a manager who insisted all percentages must add up to 100. Now, percentages as they are typically used do add up to 100, so you might think this is not a problem. But this manager was in charge of equal opportunity monitoring. The manager seriously expected percentage of minority applicants plus percentage of women applicants to add up to 100, and the manager was convinced something deeply sinister was happening when they did not. Percentages of mutually exclusive alternatives add up to 100; other percentages do not. They are measurements of different things, and the only relationship they might have to one another is “not equal.” This is why surveys of “what computer languages do you use in programming” typically have around 300 percent worth of responses: very few programmers use only a single computer language. So yes, 75% of programmers use Java and 80% use C and 50% use C# and close to 100% use JavaScript, and no, this does not mean the survey is inaccurate. It means programming languages are not mutually exclusive.
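A toy survey makes it concrete why percentages of non-exclusive categories total far more than 100. The respondents here are invented:

```python
# Four hypothetical respondents, each naming every language they use.
responses = [
    {"Java", "C", "JavaScript"},
    {"C", "C#", "JavaScript"},
    {"Java", "JavaScript"},
    {"C", "JavaScript"},
]

languages = sorted(set().union(*responses))
for lang in languages:
    share = sum(lang in r for r in responses) / len(responses)
    print(f"{lang}: {share:.0%}")

# Each respondent is counted once per language they use, so the shares
# sum to the average number of languages per respondent, not to 100%.
total = sum(len(r) for r in responses) / len(responses)
print(f"total of the percentages: {total:.0%}")  # → 250%
```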

Avoiding giving incomparable numbers is frequently difficult, because people pattern match as a fundamental cognitive activity. Two things are represented as percentages — therefore the two things are alternatives to one another because that’s what percentages are for. Usually. One can provide context indicating that the numbers are incomparable — here is a pie chart of ethnicity of applicants, here is a separate pie chart of sex of applicants, here is yet another pie chart of age ranges of applicants. It’s harder when the alternatives are not mutually exclusive.

The mandatory XKCD.

(The town where I live is home to a credit union catering to the substantial French Canadian population of the area. Everyone with a public-facing role is bilingual. Employees using English during the work day: 100%. Employees using French during the work day: 100%. My word, where did all these extra employees come from?)

Make sure your numbers are related to any numbers nearby. If not, put them in separate boxes, or on separate pages, or in separate sections, or something to indicate “these two things are independent of one another.”

Making the Measurement the Goal

One thing organizations like to do is measure themselves. As long as the measurements are treated as descriptive — “this is what we are” — this is innocuous, or even useful. But once the measurements are thought of as normative — “this is what we should be” — this is dangerous.

I know two companies that put themselves out of business with bad metrics. The first was a retail chain; management incentivized buyers based on their inventory turns per year, which is actually not a bad thing to do. But the chain took physical inventory in early January only. So inventory turns were measured as department sales for the year, divided by inventory total retail value on a specific date in January. The buyers quickly realized that the easiest way to get wonderful inventory turns and a huge bonus was to quit buying around October. You know what is a major feature of the retail year, and the part that makes retail profitable? The Christmas Rush, also known as December. So the buyers made sure the stores were out of stock of everything during the Christmas Rush. It took three years for the chain to go under — it was a large chain — with the senior executives bewildered the whole way as to how they could possibly be losing money when the buyers were all earning huge bonuses.
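The buyers’ exploit is easy to model. This is a toy version of the metric as described, with invented dollar figures:

```python
def inventory_turns(annual_sales, january_inventory):
    """Turns as the chain measured them: a year of sales divided by
    the retail value of inventory on one January snapshot date."""
    return annual_sales / january_inventory

# A buyer who stays stocked through the Christmas Rush:
print(inventory_turns(annual_sales=10_000_000, january_inventory=2_000_000))  # → 5.0

# A buyer who stops buying in October: Christmas sales are lost,
# but the January snapshot is nearly empty, so the metric soars.
print(inventory_turns(annual_sales=6_000_000, january_inventory=300_000))  # → 20.0
```

The second buyer sold less and earned the bigger bonus, which is the whole problem with a metric measured at a single, gameable point in time.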

The other company was a manufacturer. The CEO got the “Key Performance Indicator” religion, and assigned KPIs to all the departments. The CEO did this by making all the departments responsible for whatever numbers they reported. Now, the “warranty costs” number was reported by the Service department, because they were the ones who spent the money to fix warranty issues, so they were the ones who knew what warranty costs were. Under the “you smelt it, you dealt it” management model, Service was strongly charged to get their warranty costs under control! But you may notice that a warranty cost comes from not delivering the product the customer bought. Warranty costs have four major sources: engineering defects, manufacturing defects, parts defects, and shipping damage, exactly none of which the Service department controls or can even affect. Engineering chose the part with a four-connection-cycle lifetime when the product needs 500 connection cycles? Service, get it in gear. Manufacturing forgot to install the fenders on a run of 500 units? Service, get a grip. Supply Chain got the cheap fasteners that break? Service, pick it up. Shipping went with Abandon All Hope Transport? Service, what is wrong with you? The company punished Service into oblivion, and surprise! It turns out Service had been their only profitable department. That one only took a year. The company name is now a brand of another company.

Believing the messenger is somehow responsible for the message is a human cognitive glitch. This is not helped by the business habit of regarding managers as successful for externalizing costs to make their own budget look good. Philip Crosby, in Quality Is Free, makes the point that a Quality department is not, in fact, in charge of quality: it is in charge of telling you about quality. The rest of the organization is in charge of quality itself. Organizations do not have quality problems: they have manufacturing problems, or engineering problems, or packaging problems, and so on. But fixing the blame, instead of fixing the problem, remains a popular pastime. About the only thing you, as a practitioner, can do is to make sure the numbers you report also ascribe root causes. Maybe someone will clue in that Service is not in fact involved in making Manufacturing Defects.

Six Lines Written By the Most Honest of Men

“To learn who rules over you, simply find out who you are not allowed to criticize” is an aphorism commonly, and incorrectly, attributed to Voltaire. In any organization, there are those who are above criticism. And a good many of such persons will perceive criticism in even a question about organizational goals and values, once they have already chosen an answer.

I once worked for a major computer manufacturer, which had a Software Quality Program Office charged with finding out why corrective maintenance (bug fixing) cost so much, and fixing it. Over several years, the SQPO defined what was a software quality problem (high defect repair costs per customer year) and investigated development process causes (because the process, not the people, determines the properties of the process output). They found, unsurprisingly, schedule pressure to be the root cause of software quality problems: the products that slipped their delivery dates when the engineers said “not yet” had much smaller defect repair costs than those that had not slipped their delivery dates under the same circumstances. They even came up with a model of how much of one caused how much of the other: how much schedule compression cost how much in later repair costs. This made the schedule/quality trade-off one that could be made as a business decision: the business is, or is not, willing to pay $x later to get a product shipped two quarters earlier. Remember how managing to the feature checklist and the calendar gets the worst possible software? This was that, with numbers and error bars. For this frankly quite spectacular achievement, the SQPO was promptly reorganized out of existence. Apparently, the executives had been recognizing and rewarding management for quick product ship, and the winners felt the existence of this trade-off called their worth to the organization into question.

The iron triangle — “good, fast, cheap: pick two” — had claimed more victims. The computer manufacturer is long out of business now, but the lesson remains: be very wary of anything that might possibly be misunderstood as criticism.

Oh The Humanity

Now, the point of this parade of horror stories is not merely to have a parade of horror stories, or to garner sympathy, or to say management is or organizations are stupid. Management is not stupid. Organizations are not stupid. They are human. “A person is smart. People are dumb, panicky, dangerous animals and you know it.” People are often a little shaky on cause and effect. People find it difficult to see the actual result of an action because they want so much to see the intended result of the action. Messengers, unless very careful, get shot. You must tread softly and carefully when presenting information that goes against the organizational culture, or that can be interpreted by someone in power as “you are wrong.” Organizations are as much social entities as functional entities, and typically far more tribal than rational. All this is commonly termed “organizational politics,” and to be effective in your communications you must take it into account.

Actual Numbers

Any number purporting to be a measurement has an extremely important attribute: exactly what it is purporting to measure. Production per day by shift measures exactly that, not production per shift by shift. Misunderstanding that is how you get vice presidents very angry that third shift slacks off — by 10% to 15% of production! — the day before a holiday. (Hint: the day before a holiday, third shift has 7 hours in the calendar day, midnight to 7 AM. It doesn’t do the 11 PM to midnight part 16 hours later. That’s the start of their holiday.)

And numbers — all numbers, including numbers purporting to be measurements — have two attributes that are different yet related. These attributes are their accuracy — the difference between reality and the number intending to represent reality — and precision — the difference between one number and the next potential number. Precision is also related to apparent precision, which relates to how the number is represented.

Let’s talk about the simple one first: apparent precision. A work week has 40 hours. If I allocate or record my work time by hours, then I am allocating or recording in units of 2.5% of a work week. That is simply a fact, and has no real implications. But if I report my time as a percentage with one decimal place, it appears to be measured in tenths of a percent of a work week. A tenth of a percent of 40 hours is 2 minutes, 24 seconds. If I say “I spent 2.5% of my time on refactoring the RotaryFramistat class,” a great many people, many of them perfectly reasonable people, hear “I record my time to the nearest two and a half minutes.” Which is both absurd and not true. The number that comes after 2.5% is 5%, not 2.6%; but there is no reasonable way to make that evident. (I could say “2.5 ± 1.25 %,” which is all of accurate, compact, and sciency-looking, but that strikes me as a cure worse than the disease. Do you really want to try to explain error bars to a people manager?)
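The apparent-precision arithmetic above, spelled out for a 40-hour week with time recorded in whole hours:

```python
work_week_hours = 40
recording_unit_hours = 1  # time is actually recorded in whole hours

# Actual precision: one recording unit as a fraction of the week.
actual_precision = recording_unit_hours / work_week_hours
print(f"actual precision: {actual_precision:.1%} of a week")  # → 2.5%

# But reporting "2.5%" to one decimal place implies a precision of 0.1%:
apparent_unit_minutes = 0.001 * work_week_hours * 60
print(f"apparent precision: {apparent_unit_minutes:.1f} minutes")  # → 2.4 minutes, i.e. 2 min 24 s
```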

Some people are simply functionally illiterate: they can read a menu and a warning sign, and even write something simple, but they just don’t “get” written information. These functionally illiterate people are just as successful as everyone else: they can support themselves, get dressed in the morning, even run businesses. They just don’t become writers or editors; I know one who became a vice president of a very large company. Some people are simply functionally innumerate: they might be able to balance a checkbook, but “Monday and Friday are 40% of the work week” will forever be a mystery to them. These people are also just as successful as everyone else. They just don’t become accountants or engineers; I know one who became a director at a manufacturing company. Illiteracy and innumeracy are not moral failings, or even major impediments to a happy and productive life: these are simply facts about people, like handedness or height. We need to present information so the audience understands, if we want the information to be used. We need to adjust to facts about people as they are, not as we want them to be.

Some numbers are infinitely precise. A US inch is 2.54 centimeters. But that does not mean somewhere between 2.535 and 2.545 centimeters: it means 2.54 centimeters, exactly. That’s the definition of a US inch. Measurements, however, are typically neither infinitely precise nor infinitely accurate. Except counts: counts can be inaccurate, but rarely imprecise. I can miscount and say there are 15 widgets when there are really only 14, but “about 10 to 20 widgets” is not a count, it is an estimate.

A really good measurement will typically be more precise than accurate. GPS coordinates are precise to about an atomic diameter; they are accurate to about a centimeter (best case). (And if you check them against an official map, you learn that official maps are deliberately and systematically wrong: the “state grid” is a projection onto a rectangular coordinate system based on a cylinder, which is not how the Earth is actually shaped. Surveying is fun.) I can measure a board and learn that it is 8 feet, 3/32 inches long. That’s very precise, but the accuracy of the measurement depends on the accuracy of the tape measure I used, and the skill with which I used it. This is why in woodworking you dry fit before gluing up.

Sometimes measurements are very precise and wildly inaccurate. The Gimli Glider incident, for example, occurred because the fuel load of an aircraft was rather precisely computed as a certain number of kilograms — but it was actually that number of pounds, less than half the fuel the crew thought they had. The Mars Climate Orbiter was lost because ground software reported a quite precise amount of thruster impulse — in pound-seconds — and the spacecraft’s navigation software read that quite precise number as Newton-seconds.
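The size of a pounds-versus-Newtons mix-up is easy to quantify. A sketch, with an invented thrust figure:

```python
LBF_TO_NEWTON = 4.448  # one pound-force in Newtons

commanded_lbf = 100.0              # intended: 100 pounds-force (invented figure)
delivered_newtons = commanded_lbf  # the same number, read in the wrong unit

shortfall = 1 - delivered_newtons / (commanded_lbf * LBF_TO_NEWTON)
print(f"thrust delivered falls short by {shortfall:.0%}")  # → about 78%
```

The number itself can be precise to as many digits as you like; in the wrong unit it is off by a factor of about 4.4.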

Different professions handle difficulty with numbers in different ways. Let’s take time, for example, because time is easily measured. Attorneys typically record billable hours in tenths: units of 6 minutes. I do not envy attorneys their billable hour bookkeeping. Mental health professionals typically have 50-minute hours: they need 10 minutes between clients to chart and decompress. Construction crews typically schedule in half-days, but the before-lunch half-day is an hour shorter than the after-lunch half-day. Manufacturing organizations typically schedule in two-hour blocks, which due to breaks are actually not two hours. There is no good solution or universal practice in time reporting. And there is no good solution to the general issue of accuracy and precision.

The usual way to reduce apparent precision to something sensible — such as the actual accuracy of the number, or a contextually meaningful magnitude of the number — is rounding. In rounding, you pick a unit — say, millions of dollars if we are talking financials in a large organization — and report in those units. This introduces deliberate inaccuracy in the reported numbers to suppress irrelevant details. The accountants might want to debate whether it was actually $715,203,997.14 or $715,203,997.12 — which cost center consumed that pencil? — but the board and executives are fine with 715 ($ million).

But rounding also has a pitfall: it leads to footnotes saying “details do not add to totals due to rounding.” Rounding to the nearest million dollars means each number in the table is wrong by up to $500,000. Which means that if you add them up, the sum is wrong by up to $500,000 times the number of elements in the table. Which means that the correctly rounded sum is not going to be the sum of the correctly rounded details. Now, rounding is one of those things covered in fourth grade, and you even get a refresher in middle school. But no one remembers any of that. And a great many people have managed to avoid being reminded of this fact about rounded numbers — they are approximate, not exact — for their entire adult lives. And these people will point out your percentages don’t add up to 100, or the detail lines don’t add up to the total. And these people simply will not understand your explanations that the numbers are not supposed to add up to 100 or the total shown, but to something near that. And they will bring this up every time they see the situation. And eventually you have to adapt. Bar charts and pie charts are good.
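The “details do not add to totals” footnote can be demonstrated in a few lines. The dollar figures are invented:

```python
details = [1_400_000, 1_400_000, 1_400_000]  # invented; the true sum is 4,200,000

# Round each detail and the total to the nearest $ million, separately.
rounded_details = [round(d / 1_000_000) for d in details]  # [1, 1, 1]
rounded_total = round(sum(details) / 1_000_000)            # round(4.2) -> 4

print(rounded_details, "sum to", sum(rounded_details))      # → sum to 3
print("but the correctly rounded total is", rounded_total)  # → 4
```

Both numbers are correctly rounded; they simply cannot agree, because each rounding step throws away up to half a million dollars of detail.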