tldr; saw a Windows Calculator bug on reddit. Since calc.exe was open-sourced I thought I’d try to find the bug and fix it. Cloned the code, recreated the bug, and found a minimal fix.
I was goofing-off browsing Reddit, as you do on a Sunday. Scrolling past playing puppies and bad programmer humor, one particular post caught my eye.
It was about a bug in calc.exe. “Well that looks like a curious bug, I wonder what could cause it”, I though to myself. The large number of weeks certainly makes it look like some under-/overflow issue or some off-by-one error, you know, the typical causes. But it could always be some flipped bit by some high-energy ray from some friendly cosmic neighbor.
So being curious about the cause, I did what you do in these cases: try it on your machine in giddy anticipation of posting “works for me”. Actually, the exact case in the post “July 31st – December 31st” did actually give the correct result of “5 months” on my machine. But testing around a bit I found that on my machine “July 31st – December 30th” actually causes the bug. Showing the not quite so correct value of “5 months, 613566756 weeks, 3 days”.
I was not done slacking-off and I remembered “Oh, Isn’t calculator one of those things that Microsoft open-sourced?”. It is indeed. This bug couldn’t be this complex, so I thought I’d try and see if I could locate it. Downloading the source was easy enough and adding the required UWP workload to Visual Studio went off without a hitch too.
Navigating code-bases you’re not familiar with is something you get used to after a while. Especially when you like to contribute to open-source projects when you find bugs. But not even being familiar with XAML or WinRT certainly doesn’t make it trivial.
I opened the solution file and looked into the “Calculator” project, searching for any file that seemed related. Found the DateCalculator.xaml
and traced the relevant sounding DateDiff_FromDate
to DateCalculatorViewModel.cpp
and finally DateCalculator.cpp
.
While setting a breakpoint and looking at some variables I saw that the value of the final DateDifference
was already wrong. So it was not just an error in the conversion to the string, but the actual calculation that was bugged.
The actual calculation looks something like this in simplified pseudo-code:
DateDifference calculate_difference(start_date, end_date) {
uint[] diff_types = [year, month, week, day]
uint[] typical_days_in_type = [365, 31, 7, 1]
uint[] calculated_difference = [0, 0, 0, 0]
date temp_pivot_date
date pivot_date = start_date
uint days_diff = calculate_days_difference(start_date, end_date)
for(type in differenceTypes) {
temp_pivot_date = pivot_date
uint current_guess = days_diff /typicalDaysInType[type]
if(current_guess !=0)
pivot_date = advance_date_by(pivot_date, type, current_guess)
int diff_remaining
bool best_guess_hit = false
do{
diff_remaining = calculate_days_difference(pivot_date, end_date)
if(diff_remaining < 0) {
// pivotDate has gone over the end date; start from the beginning of this unit
current_guess = current_guess - 1
pivot_date = temp_pivot_date
pivot_date = advance_date_by(pivot_date, type, current_guess)
best_guess_hit = true
} else if(diff_remaining > 0) {
// pivot_date is still below the end date
if(best_guess_hit)
break;
current_guess = current_guess + 1
pivot_date = advance_date_by(pivot_date, type, 1)
}
} while(diff_remaining!=0)
temp_pivot_date = advance_date_by(temp_pivot_date, type, current_guess)
pivot_date = temp_pivot_date
calculated_difference[type] = current_guess
days_diff = calculate_days_difference(pivot_date, end_date)
}
calculcated_difference[day] = days_diff
return calculcated_difference
}
Looked good to me. Didn’t see an issue in the logic. It’s basically:
- from the start day step in years towards the end, count years
- from the last year before the end step in months towards the end, count months
- from the last month before the end step in weeks towards the end, count weeks
- set the remaining days from the week before the end
The issue is actually the assumption that running date = advance_date_by(date, month, somenumber)
date = advance_date_by(date, month, 1)
one after the other is equal todate = advance_date_by(date, month, somenumber + 1)
which it typically is. But the question is: “If you are at the 31st of a month and you increment to a month that only has 30 days, where do you land?”.
The answer for Windows.Globalization.Calendar.AddMonths(Int32) is apparently “on the 30th”.
This means specifically:
“July 31st + 4 Months = November 30th” and
“November 30th + 1 Month = December 30th”.
However
“July 31st + 5 Months = December 31st”.
So the operation AddMonths
is neither distributive (with AddMonth being the multiply) nor commutative or associative. As you would intuitively assume of an “addition” operation. Isn’t it fun working with time and calendars?
Anyway, the reason why this seemingly off-by-one value results in such a huge number is as you might suspect that then days_diff
being an unsigned type. It makes the -1 days into a huge number which then gets passed onto the following loop iteration with weeks. Which then tries to correct current_guess
downwards. But it tries to correct it once again by decrementing an unsigned variable.
Well this was fun way to spend a part of my weekend. I created a Pull Request on Github with a minimal “fix”. I put fix in quotes because it then results in the following result:
Which I guess is technically correct if you concede that “July 31st + 4 Months = November 30th”. But the complete result does not really mesh with a humans intuition of date-difference. But it’s less wrong in any case.
Wouldn’t it be possible to just post format a date difference? That is, if it has > 30/31 days, 4 weeks, or 12 months, then “carry” it?
If you have 29 days, is that:
1 month 1 day, 1 month, or 4 weeks 1 day? Answer: depends on the month.
Honestly, for the initial example, I think the “correct” solution is don’t use uints and let the answer be “5 months minus a day”.
So sad that the calculator doesn’t have unit tests at all
I think not only does it have Unit Tests: https://github.com/microsoft/calculator/blob/9f01c81/src/CalculatorUnitTests/DateCalculatorUnitTests.cpp
But it also has UIAutomation Tests. So I think it is more well tested than most Software I’ve dealt with. It’s just a case of a non-obvious failure case. I think this is the type of bug to be more likely caught by fuzz-testing or something like it.
See in the UNIX world you’d convert both dates to time_t’s, do a simple subtraction and then convert back to seconds/minutes/hours/days/weeks/months/years.
Enno: How many time_t’s are there in a month?
That works if you only want to express the difference in seconds, minutes, hours, day, and weeks. You still can’t reliably express a difference between two dates in months and years because the length of these depends on which months and years you’re talking about, something which a pure difference throws away.
time_t also exists on other platforms, but it might be too limited for what they’re doing, and furthermore you just shift the possibility of introducing a bug from one implementation to another (the stdlib). And yes, the gnu stdlib that you will typically find there has bugs too (I reported one some years ago).
Will this work correctly for various leaps? Like dst changes? And changing of dst methods?
I face the same issue quite frequently in my industry where everything is reported as of end of month.
The solve the problem the easiest, is to first add day, do your calculation, then subtract a day.
I suggest the same approach for the calculator, it will feel the most “natural” to humans
So, add a day to start and end, calculate the difference (no bug fix required)
I’ve dealt with programs that involved calculations related to number of days between dates. I’ve also written routines to do the calculation. The approach taken by calc.exe is too simplistic to work reliably as evidenced by the reported bug. The program can start by counting the differences in years first. Once it gets down to months the program needs to know the number of days in each month and whether there are any leap years involved in order to properly calculate the number of days/weeks/months/years between two dates. That might be a slightly slower method but it would be give reliable results. Considering the use case, the speed of the method used to do the calculation would not be a problem.
Seems worth testing for the special case before the calculation:
if ((day_of_month(start_date) == 31) && (day_of_month(end_date) == 30)) day_of_month(start_date) = 30;
After that the original code should be fine.
I wonder, why this is not using Julian Day: https://en.wikipedia.org/wiki/Julian_day
Was just reading about it in “Celestial Calculations” by J. L. Lawrence. See Chapter 3.
Also, C++11 has good support for dates/times, see https://stackoverflow.com/questions/14218894/number-of-days-between-two-dates-c
Surely the correct answer is how many calendar months, weeks, and days there are between the two specific dates. So you could just iterate the date incrementing month until you’re on the same month, then increment week until you’re on the same week, then finally days.
Seems to me it would be easier to just use a few if statements and some subtraction…
dateDiff Function:
//Add code before here to first make sure startDate is in fact before endDate. If not, then swap them.
yearsDiff = endYear – startYear;
if (endMonth > startMonth) {
monthsDiff = endMonth – startMonth;
} else {
yearsDiff–;//Take off the partial year we counted.
monthsDiff = endMonth + 12 – startMonth
}
if (endDay > startDay) {
daysDiff = endDay – startDay;
} else {
daysDiff = daysInPreviousMonth(endMonth, endYear) – startDay + endDay;
monthsDiff–;//Take off the partial month we counted.
}
weeksDiff = 0;
while (daysDiff > 7) {
weeksDiff++;
daysDiff -= 7;
}
Anyone want to code this pseudo up and test it? lol
Are there any errors or special cases this one misses?
Just be careful in coding daysInMonth to account for leap years.
daysInPreviousMonth Function:
return daysInMonth((monthPassedIn + 12 – 1) % 12, currentYear);
daysInMonth Function:
usualDaysInMonth = [31, 28, …];
daysInMonth = usualDaysInMonth[monthPassedIn];
if (monthPassedIn == 2 && isLeapYear(yearPassedIn)){
daysInMonth++;
}
return daysInMonth;
isLeapYear Function:
leapYear = false;
if (yearPassedIn % 4 == 0) {
leapYear = true;
if (yearPassedIn % 100 == 0) {
leapYear = false;
if (yearPassedIn % 400 == 0) {
leapYear = true;
}
}
//Code for any exceptions to the rule here. (AFAIK none yet, but calendars and rules can change.)
return leapYear;
The code in the isLeapYear function is written for clarity, but it could be reduced to a single line encoding all rules.
If you’ve read this far, thanks. I welcome any feedback and would love to hear if anyone uses this pseud-code.
It might seem an error to pass in the currentYear without checking if it rolled back a year in the daysInPreviousMonth function, but it doesn’t affect the result as the only code using the year passed into daysInMonth is the isLeapYear check and for that case it only applies if the month passed in is 2 (February) in which case it is impossible for the year to have rolled back.
Ahem, sorry if I confused anyone. That line should say yearPassedIn, not currentYear. I made some last minute changes leading to this typo.