Please reconsider the Boolean evaluation of midnight (python.org)
337 points by rivert on March 6, 2014 | hide | past | favorite | 208 comments


So ignoring the hype, here's the outcome-to-date...

The ticket was reconsidered, reopened and classified as a bug. http://bugs.python.org/msg212771

Nick Coghlan's dissection of the issue here: https://mail.python.org/pipermail/python-ideas/2014-March/02... is pretty much perfect - wonderful piece of technical writing!

Donald Stufft has expressed an interest in making the patch for this happen, and assuming all goes as planned this usage will raise a deprecation warning in 3.5 and be fully fixed in 3.6.

News in brief: User raises issue. Issue gets resolved.


Neat! This stuff actually works sometimes!


Will it be resolved in 2.x though? That's what most people care about.


2.x is EOL and only remains for legacy programs. A backwards-incompatible change won't even be considered for this branch.


> That's what most people care about.

I'll make an equally aggressive counterpoint: Dammit guys and gals, stop using 2.x. Python 3 came out more than 6 years ago. You are holding back and splitting the community. And if you tell me 3.x has no perceived value, then this is the kind of value that is slowly stacking up.


My guess is that eventually someone with enough money on the line will be sufficiently harmed by the Python 3 nonsense to maintain 2.x, at which point obvious bugfixes like this will be "ported" (it's probably a 2-line change).


Wow. This "Mark Lawrence" guy is absolutely worthless. I googled him and cannot believe the Python community still has guys like him.


I took a look and yes, this is the kind of guy that results in a system with a hundred weird exceptions that have to be memorized.


If I understand the argument there correctly, the responder is saying: Nobody should ever use this functionality, instead they should always check that the date is not None. So, we should leave this broken, because we don't want to break backwards-compatibility with that class of applications that nobody should ever write.

That philosophy, taken to its logical conclusion, results in everything being broken forever.


This is what is plaguing the ocaml compiler and standard library. Overconservatism with respect to obscure features. For example, a patch was written to speed up hashtables (IIRC), and was rejected because it would change the result of hashing format strings (type-safe format strings have a different type from plain strings).

I can't imagine a single application where you would need to preserve the hashed values of format strings. And specifying their hash values so that code can rely on them just seems like a bad idea.


Python, to its credit, has been making changes of that level.

First, True and False were just the integers 1 and 0. Then they were given special values that numerically evaluated to 1 and 0, but __str__ now emitted 'True' or 'False'. They were still global variables, though, so they could be reassigned ('True, False = False, True') and cost a dictionary lookup to use. Finally, with Python 3, they (along with None) were made keywords so that such shenanigans could be stopped.
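That history is still visible in Python 3, where True and False are keywords but bool remains an int subclass; a quick illustration:

```python
# bool is still a subclass of int, so the numeric history shows through:
assert isinstance(True, int)
assert True + True == 2      # numerically 1 + 1
assert str(True) == 'True'   # __str__ says 'True', but the value is still 1

# In Python 2 this was legal; in Python 3 it is a SyntaxError:
#     True, False = False, True
```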


It does sound like they might have gone a bit far, but as someone who works in an industrial context, a tool that maintains backwards compatibility rigorously is of enormous benefit.

The case you might have to make to upgrade the version of the tool you are using has to take into account the risks, and something that is maintaining a high level of backwards compatibility has a much lower level of risk.


Here are my preferences for tools, from most-preferred to least-preferred: first, tools that have a rigorous deprecation process for breaking changes; a distant second, tools that are incredibly conservative about breaking changes; a pretty close third, tools that refuse to ever make breaking changes at all; and an astronomically distant last, tools that make breaking changes willy-nilly.

I generally eventually have to re-write to use a totally different tool for either of the middle two cases, because they just can't keep up. I would much rather make changes to my use of them based on deprecations than be forced to ditch them entirely.

(This is more a general statement - obviously refusing to make this datetime change isn't realistically going to push anybody away from Python.)


Type-safe format strings should never be mixed with plain strings, constant strings, byte strings, unicode strings, batteries strings, byte array strings, char array strings, string buffers or any other type of string supported by ocaml. Keep your types pure and never mix them with any other types. Never. And you will never have any problem. And these python developers are lunatics. It is preposterous to use 'if' condition on a Date object and expect anything good. Idiots. They should have at least written a type-safe wrapper.


Indeed, the first thing that came into my mind when reading this was "You wouldn't have this problem if the only valid type in a conditional was boolean."


It's against the grain to think this way in dynamic languages because variables don't have types, only individual values do. Ill-conceived truthiness rules are the real problem. IMO, in a dynamic language every value except boolean false (and, if you insist, nil/null) should be truthy.


> It's against the grain to think this way in dynamic languages because variables don't have types, only individual values do.

So? "The only valid type for a conditional is boolean" still works if types only apply to values. Under that principle, anything but a True or False value encountered in evaluating the condition of an if statement ought to throw a TypeError, not be evaluated for truthiness. If you are expecting something else, call an explicit, use-case-appropriate function to get the right in-context truth value.

(Note, I'm not saying Python should do this, I'm explaining how the logic applies to Python without any contradiction to the "variables don't have types, values do" principle.)
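A sketch of what that principle could look like as a helper (require_bool is hypothetical, not anything Python actually provides):

```python
import datetime

def require_bool(cond):
    """Hypothetical helper: raise TypeError unless cond is literally True or False."""
    if not isinstance(cond, bool):
        raise TypeError("expected bool, got %s" % type(cond).__name__)
    return cond

assert require_bool(3 > 2) is True  # comparisons already yield real booleans

# Under this rule, a datetime.time in a condition would be an error,
# not a truthiness puzzle:
try:
    require_bool(datetime.time(0, 0))
except TypeError:
    pass  # exactly what the principle demands
```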

> IMO every value except boolean false (and, if you insist, nil/null) should be truthy in a dynamic language.

I think that's better than what Python does (Ruby does that by default, with nil included as false, though it is IIRC possible-but-extremely-strongly-discouraged to override the default truthiness of objects so you could have classes with falsey values), and I lean toward preferring that approach, but the idea that a dynamic language would do well to just allow True and False as the only valid (non-error-producing) values for an "if" statement is not, IMO, without some merit.


We're on the same page - I just didn't phrase that well. I mentioned it because most of the time what's written is a test of a variable, not a literal value, and that couldn't get a thumbs-up as robust without someone putting a static analyzer hat on.

Having a simple universal truthiness rule seems preferable and more in the spirit of a dynamic language.


> variables don't have types, only individual values do.

It really depends on what language you're using. This is more or less true in Python, Ruby, and Javascript. In Common Lisp generic functions, you can specialize methods on the types of the parameters and be guaranteed that, inside the method, the parameters are of those specific types. For optimization reasons you can also declare to the compiler that variables are of a specific type. This is really important when doing numeric computations in a tight loop; I've had 50%-80% performance improvements by declaring types (savings amounting to hours of run time). I think Groovy and Clojure also allow this but I don't know much about those two.

There's also the maintenance issue though. After a few years of maintaining a fairly large Lisp project I've become pretty convinced that the only way to stay sane, at least for me, is to treat it as a statically typed language. I have asserts and check-type macros all over the place. If variables may have multiple types they can be checked against '(or type1 type2) but things do become more complicated then.


Interesting. Admittedly I have only used (UnCommon) Lisps in a hobbyist way and haven't been concerned with techniques for extra speed/robustness there.


The first thing that came into my mind when reading this was "You are a fool of a great intelligence."


The latter's debatable, but probably not the former, unfortunately.


I can't tell if this is sarcasm.


Similar issues plague Haskell. A bunch of functions have illogical names, just because that's how they've always been used. And, there's the annoying matter of Applicative not being a superclass of Monad in the language, when it mathematically should be. This should be fixed in Haskell 2014, though.


Yes. This is where Linus would step in and say "SHUT THE FUCK UP! We don't break user space, EVER!" Developers who enter into the mindset that the users of their code are using it "wrong" are a blight and will always be a blight. Too bad Python has a BDFL and not a Raging DFL.


Some things are really hard on users when they break. Kernels are especially hard, because only one can run on a (non-virtualized) system at a time. System libcs are pretty hard, ditto init systems and some other deep infrastructure.

Language interpreters can be annoying, but not to the same degree. It is quite easy to run multiple Python versions on one system. It's also relatively easy to find and fix the broken Python code in a case like this -- it's always sitting there on your drive, open for inspection. And it's not black magic -- anyone with a little programming experience, even if it's not in Python, can easily understand the issue and the fix, and make the appropriate change.

Nuance matters, and unlike a boolean, principles are not always binary choices.


Linus does NOT break backwards compatibility, and breaking it is exactly what this change implies.

Which doesn't mean I disagree with changing the behaviour, but I doubt Linus would accept that reasoning for it.


It's undocumented, and very complex to use. That mail does not even have sufficient information to understand how to use this "feature". I highly doubt anybody uses it outside of the datetime codebase.

I never saw Linus complaining about changing undocumented and unused behaviour.


It is documented. That's the strongest argument against changing it: it's a documented piece of user-facing behavior. The python developers are changing it anyways because they're practical people who have decided it is more likely that people wrote bugs into their code, than that they wrote code that depended correctly on the falsiness of UTC midnight.


This is true, but even though it is documented, it is still sufficiently oddball to be useless. Consider:

    Midnight UTC, timezone EST:  True
    Midnight UTC, timezone CET:  False
    Midnight UTC, timezone CST:  True
    Midnight UTC, timezone WIT:  False
    Midnight UTC, timezone PST:  True
Now, the problem here is that it may well be documented, but it is in fact unsafe to rely on in any context. The behavior is thus irretrievably broken, documented or not; it isn't even expert-friendly. After all, timezones west of GMT will never have falsey times.

If there is a use case for this behavior, I sure can't see it. Unless you like bugs appearing when you change timezones.
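The table follows directly from the documented rule (a time is falsey iff its wall-clock value minus utcoffset() is zero); a sketch, expressing midnight UTC as the local wall-clock time in each zone:

```python
from datetime import timedelta

def falsey_under_old_rule(local_hour, offset_hours):
    # Pre-3.5 rule: falsey iff timedelta(hours=local) equals utcoffset().
    return timedelta(hours=local_hour) == timedelta(hours=offset_hours)

# Midnight UTC, as seen on local clocks:
assert not falsey_under_old_rule(19, -5)  # EST (UTC-5): 19:00 local -> truthy
assert falsey_under_old_rule(1, 1)        # CET (UTC+1): 01:00 local -> falsey
assert falsey_under_old_rule(7, 7)        # WIT (UTC+7): 07:00 local -> falsey
assert not falsey_under_old_rule(16, -8)  # PST (UTC-8): 16:00 local -> truthy

# A negative offset can never equal an hour in 0..23, which is why
# zones west of GMT never produce a falsey time.
```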


As another commenter pointed out, nuance is important. However, there are times to break compatibility and accept that people will struggle. The way to do this right is to give real warning and provide reasonable migration paths.

I will give you an example that bit my project hard, caused hours upon hours of wasted time for bugfixes, broke our software, etc. and yet was a good thing: PostgreSQL 8.3's removal of most implicit type casts. Prior to PostgreSQL 8.3, the following query would be valid and evaluate to false:

     select '2014-01-01' = 2014-01-01;
What would happen is the system would look for possible types to compare and find that text matched. It would then treat this as:

     SELECT '2014-01-01'::text = (2014 - 1 - 1)::text;
And further to:

     SELECT '2014-01-01'::text = '2012'::text
which was not what you meant. The team made the right decision to make this change and subsequently broke a lot of apps out there. I think it took us a couple months to be sure we had all the breakage out of LedgerSMB 1.2.x. Painful change. Broke a whole bunch of apps. But it was an important one and we are all better off for it.
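The same trap, reproduced in Python for illustration: the unquoted 2014-01-01 is integer arithmetic, which then gets stringified for the comparison:

```python
# What the pre-8.3 implicit casts effectively did:
assert 2014 - 1 - 1 == 2012
assert str(2014 - 1 - 1) == '2012'
assert ('2014-01-01' == str(2014 - 1 - 1)) is False  # silently false, no error
```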

Now, Linux exists in an ecosystem of mostly source-compatible operating systems. People write portable software that they expect to run on FreeBSD, Linux, and so forth with minimal tweaking. I am not sure "we don't break user space ever" would be a compelling argument against fixing non-POSIX-compliant but existing and documented behavior. But this scenario (and frankly the stupidity in the midnight type handling in Python) is what pre-review is supposed to help prevent in the first place.


Given approximately zero people using this datetime behaviour correctly, maybe one or two who tried to use it correctly and failed, and thousands of people who have bugs because they didn't expect it, I'm not sure changing it to be sensible counts as breaking user space in any practical sense...


This is one of the pet theories I've seen for why Lisp didn't become more popular - Common Lisp was standardized too early and thus nothing could ever be fixed.


My understanding is that the responder is saying: "That test does not mean what you are intending it to mean; it means something else that we've documented. We won't make it mean what you would like it to mean, to make your incorrect code work, at the expense of breaking code that is using it in the documented way."

I started on the side of the original person but was persuaded round by the responder.

"if x:" means something completely different than "if x is not None:", and that needs to be understood. The fact that they might evaluate to the same result a lot of the time is luck, and is how people have got into this mess. Changing the boolean evaluation in this case is actually making the mess even worse.


One of the most basic pages of python documentation [1] says this:

    Any object can be tested for truth value, for use in an if or while 
    condition or as operand of the Boolean operations below. 
    The following values are considered false:

        None

        False

        zero of any numeric type, for example, 0, 0L, 0.0, 0j.

        any empty sequence, for example, '', (), [].

        any empty mapping, for example, {}.

        instances of user-defined classes, if the class defines a __nonzero__() 
        or __len__() method, when that method returns the integer zero or bool value False. [1]

    All other values are considered true — so objects of many types are always true.
To me, that makes it clear that no valid time could be False. It isn't on that list, so it should be True. Is there any controversy about that?

[1] http://docs.python.org/2/library/stdtypes.html
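That list is easy to check directly (0L is Python 2 only, so it's omitted here):

```python
# Every value the documentation lists as false:
falsey = [None, False, 0, 0.0, 0j, '', (), [], {}]
assert not any(bool(v) for v in falsey)

# Everything else is truthy, including a bare object():
assert all(bool(v) for v in (object(), ' ', [0], {0: 0}, -1))
```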


It's documented in the datetime module [1]:

    in Boolean contexts, a time object is considered to be true if and only if, after converting it to minutes and subtracting utcoffset() (or 0 if that’s None), the result is non-zero.
But the only reason I know that is because I read a substantial part of the exchange where Mr. Paul Moore quotes it.

My initial reaction was to think "How the heck would you deduce midnight from this?" and I think arguing that the behavior is fully documented from this line alone is a bit of a stretch. It's true that it is documented, but not in a manner that is immediately obvious. Worse, there are 87 instances of "None" on that page, which makes searching for a specific issue somewhat daunting.

The irony (in terms of an unexpected side effect) is that by having the discussion they did, even if nothing comes of it in terms of fixes or changes, future developers bit by this behavior will be able to readily find it via a search for midnight, None values from time objects, etc.

[1] http://docs.python.org/3.4/library/datetime.html#time-object...

Edit: Left off the citation. Sorry. ;)


For the purposes of that document, datetime is not a builtin type but a user-defined class (since it's written in a Python package and not the interpreter directly). It has a __nonzero__ method defined.


You're correct, it falls under that last rule of user-defined class. They certainly have a right to define it that way, in some sense.

But it just goes entirely against the spirit of that documentation to do so. I would be fairly shocked if there are many other such exceptions to this rule in the standard library; that's just not how it is supposed to work. I'm primarily a Python developer, and I have that list of False things very deeply internalized: they are False, other things are True. I'm sure most other devs have as well.

It is right at the top of the page on the documentation of the standard types.

I'm sure there could be other classes where truthiness has some obvious physical meaning, and objects could be either True or False in a meaningful way. But midnight is not one of those.


I agree.

The issue I take with this behavior, or at least with the justification for closing discussion of whether it's applicable (though I do agree it's not a "bug" per se, simply on the merit that it's documented, even if the documentation is unclear), is the notion that changing it would break code where this behavior is relied upon. First, you're assuming that someone understands the behavior clearly enough to exploit it (while there's obviously a non-zero population who aren't aware enough to avoid it). Second, you're assuming that they're not inclined to realize what an outrageously stupid idea it is to rely on midnight == False. If someone who's attempting to use truthiness to determine whether a time object (say, from a database) is None or falsey isn't aware enough of this behavior to avoid getting bitten by it, what makes us think the population correctly exploiting it is any larger? It's insanity. More so when each of the examples presented in the discussion for how it might be used is an outrageous edge case.

Now, would it be possible to work around this by using datetimes, which as far as I can tell cannot be made falsey, and then extract the time component later, after validating that the datetime is indeed not None? I can't think of any real-world circumstances where you're not going to need to be aware of the date, timezone, and therefore DST when extracting times, except for profiling or one-off quicky applications that don't need TZ awareness.


> Changing the boolean evaluation in this case is actually making the mess even worse.

I don't see how that's possible. Assuming that using "if x:" is wrong (for some value of wrong) here, this issue only affects people who are using it wrong anyway. So all fixing it does is make the consequences of doing bad stuff less painful. I can't see how that would hurt things. Even if you think "if x:" should never be used on non-booleans, leaving land mines in there to hurt newbie (or lazy) developers is not a good way to enforce that convention. It just creates pain.

Or more simply: whether having boolean coercion for conditionals is a good idea is orthogonal to how such a coercion should work. It sounds like Python should deprecate truthiness altogether. But it hasn't, so in the meantime truthiness should work in a reasonable way, and falsy midnights are not reasonable.

Pointing out that it's documented is unhelpful (the documentation is just the wrongness restated in a different language), as is "That test does not mean what you are intending it to mean" (Tautological. The OP was suggesting changing what the test means).


You can't document your way out of a usability problem.


This is a nice way of putting things. There's a school of thought that says if things work how it says somewhere that they should work, then no part of the system is exhibiting a problem. But just in this thread we can see Python's insane treatment of default function parameters called a "bug in the language". I've long been mad about what I can only think of as a bug in the Java spec: bit shifts on ints take their shift distance modulo 32. Sure, the specification says that's what they're supposed to do, but when I ask the JVM to fill a 32-bit value with 40 zeroes from the right, I expect one of the only two sane results: an exception, or whatever value 32 zero bits represents. I'm pretty sure there is no circumstance where shifting in 8 zeroes could be correct or useful.


There's a lot of maintainers who seem convinced otherwise.


Why there is anything in this thread besides your post is beyond me.


I understand the distinction between the two ways of writing the conditional. The point is that the documented way is surprising, and as far as I can tell everyone agrees that it is wrong. It's really not a matter of making the original poster's code work, it's a matter of making the language constructs have the expected behavior. Everyone seems to agree that code that used this behavior in the documented way would be bad code. That being the case, I see no way in which changing the behavior to the expected behavior would make anything worse.


Equivalently, taken to its logical conclusion, it results in more people writing better code.


Would you intentionally introduce language features that break functionality for people who don't follow style guidelines? I certainly hope not.

The real problem with your argument is that the poorly-styled code of people who haven't learned better yet is in production. This isn't a theoretical exercise. This is a tool used in industry, and "teaching people to style their code better" is not a valid excuse for costing industrial users of the language money.

Further, Python markets itself as a beginner-friendly language. Your attitude is the exact opposite of beginner-friendly.


>Would you intentionally introduce language features that break functionality for people who don't follow style guidelines?

Uhm maybe I'm confused but isn't that the whole design philosophy behind python? It makes bad indentation a syntax error!


>It makes bad indentation a syntax error!

No, its syntax for determining block structure is indentation. The designers didn't say "Hey lets complain about bad indentation" they said "Hey, people should be indenting their code properly anyway, so why not make that the syntax?"

Also, they didn't introduce it to an existing project, it was there from the beginning (AFAIK).


Beginner-friendly doesn't mean stupid-friendly. And I am in favor of the language "teaching people to style their code better", even if in this case it's not about style but about the meaning of midnight: semantics != style. A language must also help structure thinking and offer rails for it.

Industry must bend before truth, not bend truth to itself...


We have mostly the same principles but have come to different conclusions. The semantics of midnight are indeed what is important here, but midnight does not qualify as a zero. Zero implies identity for some addition, and there is no addition for which midnight is the identity. This is because midnight does not even support addition: you have to use timedeltas if you want to add times. Timedelta 0, then, is the proper zero, and should semantically be falsy in languages where zeros are falsy. Midnight, on the other hand, should not be. (The entire notion of falsiness is terrible and should crawl off somewhere and die, but that's beside the point.)
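That distinction is already visible in Python: timedelta(0) is the additive identity for times and is falsey, exactly as a zero should be:

```python
from datetime import timedelta

zero = timedelta(0)
assert not zero                                         # the real falsey "zero time"
assert timedelta(hours=1) + zero == timedelta(hours=1)  # additive identity
```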


Agreed, midnight is not False.


    if var:
        ...
is better code (for my value of better, which is subjective) than:

    if var is not None:
        ...


In my opinion, the "explicit is better than implicit" mantra of Python (which I find very sensible) should immediately imply preference for the second, even in cases where the first is almost certainly not going to cause problems.

Testing for "not None" also immediately tells other devs at least something about what's going on. Example: If var is an argument to a function and you see "if var," really understanding what's going on means looking at all invocations of that function to see what's being passed in.

Further, in that scenario, it's possible that different types are being passed for that argument. While "if var" could make this work where "if var is not None" would break functionality, in my opinion, it should break. Without getting into arguments about static typing, "truthiness" allows both sloppy code and the accidental introduction of bugs that might otherwise be caught during development due to faulty logic from vague conditional checks.


If you shouldn't use non-booleans in an if statement, then why is it allowed? If it's allowed, it should behave in the way that would surprise developers the least.

http://en.wikipedia.org/wiki/Principle_of_least_astonishment


So since we need to be explicit, it should be:

  if var in (None, 0, "", [], {}, ...):
    pass
something like that?

In general, most of those false-y values make sense. Midnight being a false-y value does not make sense. This seems to be a case of "we represent midnight as zero internally, and zero is false-y, so midnight should be false-y." This logic does not make sense to me. If they represented noon as 0 instead, should that evaluate to false, just because?


If you need to use a test like that, you've already gone too far down the road of lazy "falsey" value usage, and whoever wrote it didn't bother with the idea of sensible defaults or consistent typing/initialization.

I agree that midnight evaluating to false is ridiculous (I wasn't aware of this, thankfully I've never run into it). I'm not sure where you got the idea that I support this, but I don't.


It's functionality that someone went out of their way to create:

    def __bool__(self):
        if self.second or self.microsecond:
            return True
        offset = self.utcoffset() or timedelta(0)
        return timedelta(hours=self.hour, minutes=self.minute) != offset
http://hg.python.org/cpython/file/302c8fdb17e3/Lib/datetime....

Now that I look at the actual code, it makes less sense. It's not just midnight that evaluates to False: it's midnight UTC that evals to False. So if your timezone is UTC+8, then 8 AM local evals to False. Since most datetime.time() objects are timezone-naive by default, this distinction doesn't matter so much, but I can see this still biting some people hard, even if "midnight is false-y" made sense.

Edit: Also, to the "explicit is better than implicit" crowd, notice this line (in the core libs):

  if self.second or self.microsecond:
      return True
It's not:

  if self.second != 0 or self.microsecond != 0:
      return True
What could be more 'Pythonic' than the Python core libraries?
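For anyone who wants to see the rule in action without digging up an old interpreter, here is the same logic restated as a standalone function (a sketch for illustration; the real method lived on the time class):

```python
from datetime import time, timedelta, timezone

def old_time_bool(t):
    # Restatement of the pre-3.5 time.__bool__ shown above.
    if t.second or t.microsecond:
        return True
    offset = t.utcoffset() or timedelta(0)
    return timedelta(hours=t.hour, minutes=t.minute) != offset

utc8 = timezone(timedelta(hours=8))

assert old_time_bool(time(0, 0)) is False               # naive midnight: falsey
assert old_time_bool(time(8, 0, tzinfo=utc8)) is False  # 8 AM local == midnight UTC: falsey
assert old_time_bool(time(0, 0, tzinfo=utc8)) is True   # local midnight: truthy!
```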


Perhaps it started because, as time zones were added, they wanted a time to evaluate as false exactly when it corresponds to midnight UTC, assuming there aren't any daylight-savings offsets in effect...


It's a rule of thumb, not a mantra.

I suppose the hilarious way to argue it would be to say that there is an implicit "all other things being equal" in front of each of those.


Call it what you want; I'm not arguing it to be pedantic, I think it's a very practical approach. But if you prefer:

All other things being equal, I see no reason to write code that needlessly introduces potential bugs and maintainability issues by ignoring the rule of thumb that explicit trumps implicit.


It's an old argument. I don't have real strong feelings about it. I think there is enough code doing one or the other out there that the end result is to understand well the ramifications of both, so it ends up a matter of taste.

The documentation is pretty clear about it:

http://docs.python.org/release/3.3.4/library/stdtypes.html#t...

Which leads to this:

    >>> a=object()
    >>> bool(a)
    True
    >>>


Once upon a time there was a language called Smalltalk, the grandfather of dynamically typed languages. Perhaps partially as a consequence of implementing if/else as methods on the boolean classes, taking block closures to be executed, only booleans were allowed in conditionals (anything else is a missing method). If you wanted to test for null (nil) you had to say something like:

    self foo isNil ifTrue: [self defaultFoo]

Later, some implementations added a method on Object to test for nil:

    self foo ifNil: [self defaultFoo]

If you had to test for nil before testing a condition, it would be a bit ugly:

    self foo ifNotNil: [(self foo isFooish) ifTrue: [self defaultFoo]]

In Ruby, which was developed a bit later (perhaps too late to influence Python's bool handling), they got rid of this problem by treating nil and false as falsey and everything else as truthy.

In python everything is true except:

    None

    False

    zero of any numeric type, for example, 0, 0L, 0.0, 0j.

    any empty sequence, for example, '', (), [].

    any empty mapping, for example, {}.

    instances of user-defined classes, if the class defines a __nonzero__() or __len__() method, when that method returns the integer zero or bool value False. [1]
http://docs.python.org/2/library/stdtypes.html (In Python 3, __nonzero__ is __bool__.)

The reason for 0 being a falsey value is arguably historical: false used to be 0 and true used to be 1, and when they introduced a bool type they made it an integer subtype. Some of the important Python people (who are much better programmers than me) may disagree (http://stackoverflow.com/a/3175293/259130), but I think it was an ugly stain and should have been removed in Python 3, along with the whole "0 (of any numeric type) is falsey" rule.

The upside to Python behaving like this is that in many situations you don't care what kind of falsey value you may have, so you can write:

    if person and person.name:

instead of:

    if person != None and person.name != None and person.name != "":

and:

    if students and "John" in students and students["John"].age and students["John"].friends:
        print "John has friends"

instead of:

    if students != None and "John" in students and students["John"].age != None and len(students["John"].friends) > 0:
        print "John has friends"

There are downsides to this, people may have strong personal preferences for their own scripts, and you can argue that the Ruby way or even the Smalltalk way is conceptually nicer (and more "explicit"), but this is proper pythonic style and generally you should use it in Python (except perhaps relying on numeric zero being falsey, other than when testing the len of a collection, and unless you specifically depend on different code for different falsey values).


Not in python.

    if var:
        ...
is better in some situations, but is objectively worse in many others. If you really do want anything that's "falsey" to fall into that conditional, then by all means, use it! Just be aware that `0`, `[]`, `None`, `0.0`, etc will all be treated the same.

However, it's harmful in one of its most common use cases: default values.

For example, let's say you're working with a function that takes an optional argument similar to:

    def foo(x, values=None):
        if values is None:
            values = []
You shouldn't use a mutable default argument for several reasons, so instead you make the default None and create a fresh empty list inside the function. The snippet above is the standard idiom.

Let's say a user mistakenly passes "values=0" instead of "values=[0]".

If you had done:

    if not values:
        values = []
Then the code will happily proceed with "values" being an empty list and _silently give incorrect output_ instead of raising an error a couple of lines later.

You can make the (very reasonable) argument that this is all the fault of dynamic typing, and if python was just a statically typed language, the compiler would catch all of this, but that's beside the point.

Be aware of what you're testing if you choose "if var:" instead of "if var is not None:"
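To make the failure mode concrete, here's a minimal sketch (the function and argument names are mine) contrasting the two idioms:

```python
def foo_truthy(x, values=None):
    if not values:            # treats 0, [], "" the same as None
        values = []
    return [x] + list(values)

def foo_explicit(x, values=None):
    if values is None:        # only replaces a genuinely missing argument
        values = []
    return [x] + list(values)

print(foo_truthy(1, values=0))   # [1] -- silently wrong output
# foo_explicit(1, values=0)      # raises TypeError: 'int' object is not iterable
```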


default values is a special case (and one which I consider to be a bug in python, frankly, since it bites everyone who didn't read up on default arguments before they wrote some code - everyone uses [] as a default argument at some point and then spends an amount of time proportional to the complexity of their program debugging the insane behaviour before noticing. My program was quite long and complex, and I am still bitter.)

I didn't argue, and certainly didn't mean to imply, that it is always better. But in the sort of case we're considering in this example, i.e. where we expect a valid time value or None, it really is madness to imply a time value of midnight is "empty".


Nice job explaining away one language bug with another.


"Also, beware of writing if x when you really mean if x is not None -- e.g. when testing whether a variable or argument that defaults to None was set to some other value. The other value might have a type (such as a container) that could be false in a boolean context!"

From PEP-8: http://legacy.python.org/dev/peps/pep-0008/


They do two entirely different things in python. Occasionally you might actually want the functionality of the first statement, but if you write the first statement and expect it to do the same as the second statement, then that is a bug in your code.

You could argue that the python is badly designed in this case and that the two should be equivalent, but that's a different argument.


That is a rather silly statement since the two do not have the same behavior. So it is not a subjective question: one is right and the other is wrong.

(For certain types, they do have the same behavior. In those cases, the subjective question is fine. But times are not a case where the behavior is the same.)


I prefer the second one.

Explicit is better than implicit.


If explicit is always better than implicit, why is Python dynamically typed?


Because at the time it was originally written, it seemed impossible to write a statically typed language that was beautiful, or even readable.

This has now proven to be false, and python is starting to add optional type annotations. It's not practical to introduce static typing more quickly - it would break too many existing programs.


So why is it better to have to check for None but not having to check for midnight?

The latter seems a lot less common and surprising.


The second what?


The second example:

    if var is not None:
        ...


I was making a joke about being explicit versus implicit ;)


By what criteria have you decided that?


quicker to scan-read ("code will be read many more times than it is written or modified", being the relevant axiom)


Reading code is a purposeful activity, though; namely, to understand what the code does. The speed with which that can be done is relevant in the margins from a productivity standpoint, but to build an argument for code structure around it strikes me as missing the point of the "axiom" you quote.


One of the core tenets of Python is that "explicit is better than implicit". So why is your way better than an explicit check?


No, it doesn't; it just invites different mistakes.

If they wanted to make people write better code they would throw an error on coercion from time to bool so that everyone would have to fix the code.


I would agree with his approach, but unfortunately the Python community has encouraged, to some extent at least, the use of implicit "truthiness" to do things like check for None, rather than explicit checks.


I'm not sure if I'd say that it's the Python community that has encouraged that practice.

I think it's more likely just a bad habit that some people may have picked up when using PHP or JavaScript, and mistakenly continued using after moving to Python.

PEP 8 (http://legacy.python.org/dev/peps/pep-0008/#programming-reco...) is quite clear about this:

"Comparisons to singletons like None should always be done with is or is not, never the equality operators.

Also, beware of writing if x when you really mean if x is not None -- e.g. when testing whether a variable or argument that defaults to None was set to some other value. The other value might have a type (such as a container) that could be false in a boolean context!"

PEP 8 is more representative of the actual Python community's views than the mistakes of some programmers who are more accustomed to certain languages that are not Python.


INADA Naoki's argument [1] is succinct and insightful.

  I feel zero value of non abelian group should not mean
  False in bool context.

  () + () == ()
  "" + "" == ""
  0 + 0 == 0
  timedelta() + timedelta() == timedelta()
  time() + time() => TypeError
[1] https://mail.python.org/pipermail/python-ideas/2014-March/02...
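For reference, here is how this plays out at the prompt (midnight's falsiness was the behavior under debate; it was changed in Python 3.5 so that every time instance is truthy):

```python
from datetime import time, timedelta

print(bool(time(12, 0)))   # True
print(bool(time(0, 0)))    # True on Python >= 3.5; False before the fix
print(bool(timedelta(0)))  # False: a zero *duration* is the additive identity
```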


If that were true, it would mean that dict should not support bool:

    >>> {} + {}
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for +: 'dict' and 'dict'
    >>> bool({})
    False
Since I find bool(a_dict) to be very useful, and the lack of a_dict+b_dict to be reasonable (precisely because it's non-abelian!), I conclude that logic is cute, but not that relevant.

For those who think this argument is persuasive: should bool({}) raise an exception? Or should {1:"A"}+{1:"B"} not raise an exception and instead return ... what?

Or do I misunderstand something?


Honestly, I would be fine with nothing supporting bool, and requiring `if` conditions to be bools. Arrays and dictionaries could have methods like `.empty?` that return a bool. Strings could have `.blank?`. Obviously, I'm stealing the methods from Ruby, although Ruby also has its own notions of truthiness of non-booleans.


There are many languages which do something like that. Python is not one of them. Personally, I find it useful to write things like "if x:" and not worry about whether x is None, a list of characters, or a string of characters.


I guess it comes down to the much more fundamental dichotomy between dynamic and static typing. I think most programmers would agree that having a function that accepts strings, lists, and dicts into the same argument (and thus sends them all through the exact same code path) is usually a bad idea. In Rails, I often find myself calling `to_i` or `to_s` just so that I can safely reason about that variable in the subsequent code.


Which means you don't like generic programming or templates. That in turn tells me something about the types of programmers you know. For example, neither you nor they likely use the Boost libraries.

In any case, it's a bit of a distraction. "Most programmers" aren't all that good at programming, or judging what makes a good language. How important then is it that I weigh your projection of their ideas of right or wrong?

In Python it absolutely, positively, without a doubt is not a bad thing to have a function which accepts multiple data types. Some trivial examples are len() and iter(). Since you don't like the core language design in Python, I think it's safe to argue that your sensibilities are going to have many objections to any other part of the language.


Do any of your examples (generic programming, templates, Boost libraries) actually utilize the concept of testing the truthiness of an object of unknown type? I think your criticism is invalid. I like generic programming in Java, though I admittedly have not used templates or Boost libraries for anything substantial.

> In Python it absolutely, positively, without a doubt is not a bad thing to have a function which accepts multiple data types. Some trivial examples are len() and iter().

I'm not arguing that this shouldn't be the case. My argument is specifically about bool(). If len() or iter() worked as strangely with basic data types as bool(), then I would probably extend my argument to cover them as well. For any data type I'm aware of that works with len() or iter(), the logic of what is returned is pretty clear and obvious, but that's not the case (in my opinion) with bool(). If len(-5) returned 5 (i.e. absolute value or "length from zero") then I would argue that it's a bad idea.


C++ doesn't have unknown types, so we're working with different definitions. Here's an example:

    [xebulon:~/tmp] dalke% cat tmp.cc
    #include <iostream>
    
    template<typename T>
    int f(T s) {
     return s ? 2 : 0;
    }
    
    main() {
      std::cout << "int 0 " << f(0) << std::endl;
      std::cout << "int 8 " << f(8) << std::endl;
      std::cout << "float 0.0 " << f(0.0) << std::endl;
      std::cout << "float -1.0 " << f(-1.0) << std::endl;
    }
    [xebulon:~/tmp] dalke% g++ tmp.cc
    [xebulon:~/tmp] dalke% ./a.out 
    int 0 0
    int 8 2
    float 0.0 0
    float -1.0 2
The function f() doesn't know the type, but its instantiation for f(0) (integer) and f(0.0) float know the type.

C++ containers don't have a bool (or at least vector<> doesn't). For one, until C++11 there was no "explicit operator bool", and a simple "operator bool" was too permissive because of implicit type conversion. C++11 introduced a more contextual conversion to bool.

More and more components have explicit bool support. For example, http://www.boost.org/doc/libs/1_55_0/libs/smart_ptr/shared_p... ?

    Notes: This conversion operator allows shared_ptr objects to be
    used in boolean contexts, like if(p && p->valid()) {}.

    [The conversion to bool is not merely syntactic sugar. It allows shared_ptrs
    to be declared in conditions when using dynamic_pointer_cast
    or weak_ptr::lock.]
and http://en.cppreference.com/w/cpp/memory/unique_ptr/operator_... .

I do not track C++ well enough to go into any more depth than this, especially as it concerns the history and future.

I like to say "if not x: ..." and not "if len(x) == 0: ..." in order to check if a dictionary is empty.

Following Scheme, I can understand that something like "if empty(x)" might be more explicit. But I have a decent amount of code where I do something like:

   def process(a, rename=None):
     if rename:
       a = [rename.get(x, x) for x in a]
     ... do things with a ...
In this toy example, "rename=None" indicates that there is no renaming dictionary, and using {} as the renaming dictionary won't rename anything, so the "if rename" tests for both conditions correctly.

Without bool, I could write it as:

     if rename is not None:
since using the empty dictionary in this case is okay, but if I want the slight extra performance for the empty dictionary case I would have to write it:

     if rename is not None and not empty(rename):
I think that would grow tedious.

BTW, I don't use "def process(a, rename={}):" because that is one of Python anti-patterns: the default values are constant over the life of the function, so

    def something(a, b={}):
      b[a] = a
      return b
will accumulate to b rather than create a new dictionary each time. Thus, seeing a "={}" or "=[]" in a parameter list demands closer inspection because it often leads to errors.
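A quick demonstration of the accumulation described above:

```python
def something(a, b={}):   # anti-pattern: the default dict is created once,
    b[a] = a              # at definition time, and shared across calls
    return b

first = something(1)
second = something(2)
print(first is second)    # True -- both calls returned the same dict
print(second)             # {1: 1, 2: 2}
```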


bool({}) should raise an exception, if we're being strict.

You could try to define the addition of keys like you suggest, but you need to force a value type which defines a + operation into the same type. Which is what Counter() does, below (appending strings).

Python is dynamically typed, so you can't maintain these expectations. Therefore, summation and bool({}) should be an error, or merely "true" (ie, yes, it exists, and is not None) -- as per the original thread and resolution.


Good thing Python isn't a strict language then. That would be Scheme. ;)


{1: ["A", "B"]}?


Okay, then what about

  {1: "A"} + {}
Should that return {1: ["A"]}? In which case {} is no longer a zero value.

What about

  {1: "A"} + {} + {1: "B", 2: "Z"} + {1: "C"} + {}

In any case, your suggestion would break all existing code. Based on all the StackOverflow and python-list questions, most people expect one or the other value, and few want the two-element list containing both. (And why a list instead of a tuple?)

Guido chose the current behavior because there is no clear-cut winner. "Refuse the temptation to guess."


This could be interesting. One would have to define what hash math should mean. Personally, I would expect (based on my naive assumptions here) that:

    {1: "A"} + {} == {1: "A"}
but

    {1: "A"} + {} + {1: "B", 2: "Z"} + {1: "C"} + {}
should reduce to:

    {1: ("A" + "B" + "C"), 2: "Z"}
Whether you can add "A", "B", and "C" would depend on string functions (could be "ABC" or raise an exception that these cannot be added for example).


How does {1:"A"}+{} = {1:"A"} imply that {} is no longer a zero value? 1+0=1 makes total sense...


It's not just that one expression, but when done in addition to the previous suggested answer. That is, what is the type of the values in the summed dictionaries?

If {1:"A"}+{} == {1:"A"} then the type of each value is unchanged from the input list.

If {1:"A"}+{1:"B"} == {1:["A","B"]} then the type of each value (or at least each shared value) is a 2-element list.

If both are true, then no one can write sane code which knows how deal with the sum of two dictionaries.

That is, the insane code has to have checks for if one of the dictionaries is empty, and if so, do one thing, otherwise do another thing. In which case, why is this definition useful?

Therefore, either {} is the zero value, or dictionary summation produces lists for the resulting values, but not both.


In the comment above, the hypothetical result is {1:["A"]}, not {1:"A"}. (The value is in a list.)


Very little code that expects "A" or "B" would be equipped to handle ["A","B"].


{1:"AB"}

That's how it usually works with functions.


If you want that, try the Counter class:

    >>> from collections import Counter
    >>> Counter({1: "a"}) + Counter({1: "b"})
    Counter({1: 'ab'})
But Python dictionaries won't support '+' because of the ambiguity. This goes back to 1994: http://ftp.ntua.gr/mirror/python/search/hypermail/python-199...

    Lance Ellinghouse:
    why can't I add dictionaries like I can other objects?
    ...
    if a and b both had the same keys, then I think b should override
    the keys in a.
and Kenneth Manheimer posted a dict-like implementation in http://ftp.ntua.gr/mirror/python/search/hypermail/python-199... which also has the replacement behavior.

Guido responded in http://ftp.ntua.gr/mirror/python/search/hypermail/python-199... saying that he could think of no reason to prefer replacement over keep behavior.

In my own early Python code, I sometimes used + when I wanted an update semantic, so I can confirm that your expectations aren't universal.

And that's what this comes down to - is bool(time(0, 0))==False the generally expected behavior (which is a higher threshold than reasonable behavior)? What are the likely errors? Do the advantages outweigh the disadvantages?

That's why an argument about the zero element is only somewhat relevant - most programmers aren't mathematicians, and practicality beats purity.


This is neat:

    >>> a = Counter()
    >>> a[1] = a
    >>> b = Counter()
    >>> b[1] = b
    >>> a + b
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "...../collections.py", line 536, in __add__
        newcount = self[elem] + other[elem]
      File "...../collections.py", line 534, in __add__
        result = Counter()
    RuntimeError: maximum recursion depth exceeded
If dict() implemented add as you propose then it would be subject to the same errors. This observation doesn't make your proposal wrong. I mean it to point out the subtle complexities which might come up when a large number of people don't know what it's supposed to do (even when it's documented).


That's an interesting point, but I'm inclined to say that very few languages could come up with {1:this} - Prolog could, I think - and that running into those errors in Python would be user error.

That said, this is a good example of the expressivity of this, which mathematical definitions often implicitly disallow.


An analysis shouldn't stop at "user error". Instead, ask if the programming language design plays a role.

For example, NUL terminated strings lead to a lot of user errors, some of which lead to security holes. Other string representations don't have that flaw, though come with a different cost concern. Is a buffer overflow "user error"? Some would say it is. But the language design makes those errors more dangerous.

Or in Python, there's no technical reason to have a ":" at the end of the line before an indented block. Instead, it's there because user studies show that people learning ABC (which influenced a lot of early Python) made fewer mistakes if the ":" was present than if it wasn't.

Are indentation mistakes user error? Certainly users play a role. But again, the design does as well.

So saying that something is a "user error" with no further analysis absolves the designer of any responsibility, and I disagree with that idea.

There are at least 4 different ways to handle dict+dict in Python. At least three have come up in this thread as the proposed correct solution. Even if there is a mathematically clean solution, if only 10% of the people expect it to work that way, then why should Python introduce something which is so error prone? No support for dict+dict is 100% error prone, of course, but trivially identified in testing. While Counter+Counter-like behavior has subtle consequences that will trip people up.
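To sketch a few of those competing semantics (the variable names here are mine; dict + dict itself raises TypeError):

```python
a, b = {1: "A"}, {1: "B", 2: "Z"}

replace = {**a, **b}     # later value wins
keep    = {**b, **a}     # earlier value wins
collect = {k: [d[k] for d in (a, b) if k in d]
           for k in a.keys() | b.keys()}   # both values, in a list

print(replace)   # {1: 'B', 2: 'Z'}
print(keep)      # {1: 'A', 2: 'Z'}
print(collect)   # {1: ['A', 'B'], 2: ['Z']}
```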

(As another example of the subtleties, consider an inverted index mapping word to a list of document ids:

    collection_a = {"a": [0, 3, 4], "the": [0, 2, 3]}
    collection_b = {"a": [5], "an": [6, 7]}
    collection_c = {"the": [10], "not": [10]}
    merged_collection = collection_a + collection_b
    merged_collection += collection_c
This is wrong because the += changes merged_collection["the"], which is the same list as collection_a["the"]. So even though it looks like good code, and it is good code for any value where x+=y is the same as x=x+y, it may cause problems which are hard to spot.)
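Since dict + dict doesn't actually exist, here is the same hazard sketched with an explicit shallow merge:

```python
collection_a = {"a": [0, 3, 4], "the": [0, 2, 3]}
collection_c = {"the": [10], "not": [10]}

merged = dict(collection_a)          # shallow copy: the list values are shared
for word, ids in collection_c.items():
    if word in merged:
        merged[word] += ids          # += mutates the shared list in place!
    else:
        merged[word] = ids

print(collection_a["the"])   # [0, 2, 3, 10] -- the source index was corrupted
```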

"In the face of ambiguity, resist the temptation to guess."


An abelian group is a rather stringent requirement; they probably just mean a plain monoid? After all, it works for python lists, which form a non-commutative monoid:

    [] + [] == []
    [1,2] + [3] != [3] + [1,2]


The additional property which makes a monoid a group is the existence of inverse elements. So list is a good example for your point, because there are no "inverse lists".


This is a nice clean criterion, but it's still pretty unclear to me why the zero element of a monoid should be considered falsey at all.


Because when writing a recursive function, the zero element is generally the base case.

    if x:
        <destructure x, recurse on subparts>
    else:
        return <something>
This applies to (natural) numbers as well--'destructuring' usually means decrementing the number.
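For instance, a sketch of that base-case pattern for lists:

```python
def total(xs):
    if xs:                     # non-empty list: destructure and recurse
        head, *tail = xs
        return head + total(tail)
    else:                      # the zero element [] is the base case
        return 0

print(total([1, 2, 3]))   # 6
print(total([]))          # 0
```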


Came here to say this. It's the most useful statement in the entire thread of "you're wrong!" "no, you're wrong!" and nobody noticed.


Perhaps people ignored it because it isn't true? {}+{} fails, because addition of dictionaries is non-abelian and Python refuses to choose a preferred result. But bool({}) is perfectly reasonable.


I just tried this:

    $ python
    >>> import time, datetime
    >>> () + ()
    ()
    >>> "" + ""
    ''
    >>> 0 + 0
    0
    >>> datetime.timedelta() + datetime.timedelta()
    datetime.timedelta(0)
    >>> time.time() + time.time()
    27848327051.323207
Seems like time() + time() is not a TypeError in Python.


Wrong time(), you want datetime.time().

time.time() returns the seconds elapsed since the epoch.


What do those code samples have anything to do with abelian groups? Abelian groups are those whose group operator is commutative, i.e. a + b = b + a for all a and b.

The code examples seem to be referring to the existence of an identity element, which is necessary for all groups.


I think what he was saying was:

- You should use + only for operations that form an abelian group across the domain.

- In those cases, it makes sense for the identity to be false.

The truth is, time doesn't have a sensible addition/combination operator, never mind an identity and inverse, so it isn't even a group.


Okay, I definitely buy that. It makes sense to add a duration (timedelta in Python) to a time, but not to add times to one another.

This is venturing off topic, but if you really wanted to, you could define addition for times in terms of timedeltas. It makes sense to subtract times and get a timedelta, so you could define addition as the subtraction with signed times and signed timedeltas. So, since `time(today) - time(yesterday) = timedelta(1 day)`, we could say that `time(today) + time(yesterday) = -timedelta(1 day)` and `time(today) - -time(yesterday) = timedelta(1 day)`. It's a bit strange to define the notion of a signed time object, but it could work.


Yeah, I was trying to figure that out too. They seem to care about closure and identity, but most of the examples have operators that trivially break an Abelian group's commutative requirement.


But you can make an equally succinct argument in the opposite direction, namely to treat time values as polar coordinates. Then they indeed form an abelian group:

    a + 0:00 = a
    15:00 + 13:00 = 04:00
    12:00 + 12:00 = 00:00
The examples are for hours only, but you can extend it for minutes too. Showing that time values on a clock forms an abelian group under addition is a textbook example everyone uses when introducing group theory.
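A minimal sketch of that clock group (datetime.time itself defines no +; this is just hour arithmetic mod 24, with function and argument names of my choosing):

```python
def add_hours(a, b):
    return (a + b) % 24       # wrap around the 24-hour clock

print(add_hours(15, 13))  # 4  (i.e. 04:00)
print(add_hours(12, 12))  # 0  (midnight is the identity element)
print(add_hours(7, 0))    # 7  (adding midnight changes nothing)
```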


But it's wrong.

I'll assume the times you gave were in the UK's timezone. Here they are rewritten to use Germany's timezone.

    a + 1:00 = a
    16:00 + 14:00 = 05:00
    13:00 + 13:00 = 01:00
Still think this makes sense?

Time deltas? Then you don't have the timezone or epoch problem. Works fine.

But time stamps? You can't add two time stamps and get a sensible answer.

Time stamps form a torsor, not a group: http://math.ucr.edu/home/baez/torsors.html


You're right, but time values with timezones are completely different types from time values without. They follow different rules and need to be treated differently.


And this is why programmers who know math are better programmers.


Adding times is semantic nonsense. It's irrelevant how this relates to Abelian monoids. Mathematically beautiful form is not a virtue if it just creates bizarre and confusing interfaces.

edit: last Tuesday, 5:00pm + Christmas two years ago at noon = ???


An ironic comment, considering that appears to me to be a misuse of "abelian group." Abelian groups are those whose group operation is commutative, which has nothing to do with the code examples provided.


It doesn't make sense that times should be added at all.


I've never seen a good argument for anything besides "false" to be considered false. Likewise for "true". Keystrokes are not a commodity for most coders, and compilers are not dumb; just be explicit and write "!= 0" or whatever.

(And 0 == False, "" != False, but both 0 and "" are considered false? C'mon Python, that's borderline JavaScript territory.)


I'd say null/undefined is a good candidate for a falsey value since existence checking is a very common operation. Conceptually I'd rather check what the incoming variable is, as opposed to what it is not. Anything else is dangerous or incredibly dangerous (e.g. 0 == false). Besides, empty values like [], {}, or "" are usually handled pretty well by code that works with their non-empty equivalents, so there's no conceptual or pragmatic reason to have those evaluate to false.


Python has truthiness and falsiness as well.

That shortens code like:

   if not x: 
      do_something()
x could be 0, None, False, the empty set set(), {}, [], (), '' or "", or, I believe, any class or object that appropriately overrides the __nonzero__() or __bool__() methods.

It is a matter of taste, and I rather like it.


I appreciate falsiness in python as well (especially for None, [], {}, set([]))... Interestingly, I think zero is arguably the most dubious of the falsies (not that I'd recommend changing it, given the obvious backward-compatibility pain): I can remember being burned by bugs from zero evaluating False a number of times, yet I can't once remember being burned by any of the other values I listed.

I do hope the ticket in question is reconsidered though. Midnight is obviously a "valid and populous instance" that shouldn't evaluate False.


> I appreciate falsiness in python as well (especially for None, [], {}, set([]))... Interestingly, I think zero is arguably the most dubious of the falsies (not that I'd recommend changing it, given the obvious backward-compatibility pain): I can remember being burned by bugs from zero evaluating False a number of times, yet I can't once remember being burned by any of the other values I listed.

Agreed, "0 as False" sounds like "let's take a leaf from C and its bad type system". I generally rely on falsiness as well, except when dealing with numbers, and the inconsistency is annoying. I also agree that midnight is not inherently more false than noon or any other time. But I guess that's a risk with falsiness, just like it is with other magic methods: sometimes you take the magic too far.


"0 is False" is extremely useful to a Pythonista. Ever indexed into an array with a boolean offset? Yes, you can do that. (False is 0, True is 1). Makes for very succinct code.
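For example (variable names mine):

```python
# bool is a subclass of int, so False indexes as 0 and True as 1
status = ["closed", "open"]
print(status[True])    # 'open'
print(status[2 > 5])   # 'closed'
print(status[3 == 3])  # 'open'
```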


    >>> bool() == int()
    True


Well, if you read the mailing list thread there are several core python maintainers saying that is bad style. So I wouldn't call it just a matter of taste when core maintainers disagree with it and parts of the standard library break your code when you do it.


> re several core python maintainers saying that is bad style.

Well, I am a python programmer that has been using it for almost 10 years, 7 full time, and I say it is nice. So yes, I would call it a matter of taste.


It's not a matter of taste when it breaks functionality.


It depends on what x is and what do_something() does.

More often than not, in my code that wouldn't break functionality. In the case of dates it is broken, so the same code pattern there would be broken.

At this point a lot more functionality would break if 0 become non-False or threw a TypeError exception.


Yes, the whole truthiness thing is bug-prone.

I think the history of Lisp is instructive. In traditional Lisp all the way through Common Lisp, the empty list, also known as NIL, is false, and everything else is true. This is a relatively benign form of truthiness, but even so, the type-punning it invites tends to make programs less clear (even though they're a little shorter).

Since Scheme was a clean sheet design, Steele and Sussman had the opportunity to change this, and they did: Scheme has explicit '#t' and '#f' literals, and no implicit conversions from other types to boolean. I think their reasoning was sound.


I really don't think there is a good argument. Any time saved in writing the code is inevitably repaid in debugging it.


If you are talking about people writing 0==False in the stdlib, I argue that's bad code. There are times when you don't want to distinguish 0, False, None, or empty. Unlike JS, which has == and ===, Python only offers ==, and I like that. Coding like !=0 is very C-like to me, and not taking advantage of Python. In the Python world it is common, and advised, to test truthiness until you have to test the actual value. That is, you can get away with a lot of "if object" instead of "if object.size == 0" or "if function() == 0"; I think you can do this by implementing your own __eq__ and __ne__ for the object. Imagine someone has multiple values to return depending on the logic branch; the following is arguably ugly and bad, but say it happens:

   if cond1:
     return -1
   elif cond2:
     return -2
   elif cond3:
     return -3
Unless there is a good reason to return these ints, if you only care about the semantic (to distinguish which branch in the caller's), just raise your own exception.

   if cond1:
     raise cond1Exception
   elif cond2:
     raise cond2Exception
So there are ways to go around and make code more readable. is this a bad coding? Each has its own taste.


Starting to raise exceptions for non-exceptional conditions is bad code. And yes, I know that's how generators signal they are exhausted, but it's still bad code.


> And 0 == False, "" != False, but both 0 and "" are considered false?

Also, 1 == True, "x" != True, but both 1 and "x" are considered true. I don't quite understand why you find this so surprising.

The most problematic thing here, IMO, is that bool is a subtype of int in Python.


I sometimes think that way, but when I do, I ask why you can use any non-Boolean in a conditional? That's optional too.

As for uses: because empty lists, zero and empty strings are false, you can do things like:

    if validation_errors:
    if special_instructions:
    if unused_widgets:

Of course those are just saved keystrokes, and can lead to bugs, but so is being able to throw a list into a condition.


This is because `bool` is a subclass of `int`. Which sort of makes sense to C programmers, for better or for worse.


Why is it unreasonable for comparisons to behave differently than Boolean checks?


I actually wish it would also return False when the variable is undefined


I just got bit by this a few days ago. I was creating an event scheduling system that uses either repeating entries with a datetime.time, or one time entries with a datetime.datetime. I had code that said "if start_time" to see which it was, and discovered later that midnight evaluates to false. It's not the best idea.


What's actually interesting if you dig into the discussion is that the actual behavior depends on your timezone.

Basically from UTC east to the IDL, it behaves as you describe it. Anything west, it evaluates to true as long as it is set and there is a timezone offset. This is because once you are at a negative offset, you subtract and get a higher number so it will never evaluate to false with a negative offset.

Basically this current behavior is useless for anything I can imagine. You can't use it to check for midnight UTC. You can't use it to check for whether it is set. You can probably use to check for midnight with no timezone but add a timezone and now your handling is messy.

This is why contracts need to be sanely designed, from a code-contract point of view, ahead of time. This was apparently a developer describing how his code worked and later calling it a contract, which is a very bad thing :-D


Ignoring Python for a bit and thinking as a designer of some hypothetical future language: there is a nice rule given here for evaluation in a Boolean context. I wonder whether it should be taken as a general guideline for future languages.

The rule, in its entirety, is this:

- Booleans are falsy when false.

- Numbers are falsy when zero.

- Containers are falsy when empty.

- None is always falsy.

- No other type of value is ever falsy.
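As a sketch (in Python, though the rule is for a hypothetical language), a user-defined container following that rule; the class name is mine:

```python
class Bag:
    """A toy container that is falsy exactly when empty."""
    def __init__(self, items=None):
        self.items = list(items) if items is not None else []

    def __bool__(self):
        return len(self.items) > 0

print(bool(Bag()))        # False: an empty container
print(bool(Bag([1, 2])))  # True
```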

I can think of two ways we might possibly want to alter the rule.

The first is to expand the idea of number to include arbitrary groups (or monoids?), with the identity element being falsy. So, for example, a matrix with all entries zero might be falsy. Or a 3-D transformation might be falsy if it does not move anything.

The second is one I have encountered in C++. There, an I/O stream is falsy if it is in an error state. This makes error checking easy; there is one less member-function name to remember. We might expand this idea to include things like Python's urllib, or any object that wraps a connection or stream of some kind.

EDIT: OTOH, there is the Haskell philosophy, where the only thing that can be evaluated in a Boolean context is a Bool, so the only falsy thing is False.

EDIT 2: The comment by clarkevans (quoting a message from INADA Naoki) already partially addressed the above group idea: "I feel zero value of non abelian group should not mean False in bool context."


Making anything besides booleans "truthy" strikes me as asking for a lot of trouble for very little gain. For each of these cases, how hard is this to write:

    x == 0
    isempty(x)
    x == nothing
How much clearer is the intent of that code than a truthy boolean test would be? This clarity is all the more important in dynamically typed languages without type annotations since the code itself doesn't give any hint what the type of `x` might be, so the programmer doesn't know what it is the test is really checking for.

On the other hand, the cost of truthiness is pretty significant. Each person reading or writing code in a language with truthiness must remember all the arbitrary truthiness rules that particular language uses – and they're quite different for each of Python, Perl, Ruby, C, JavaScript, etc. As this issue demonstrates, whenever you open the door a little, even in a generally sane language like Python, at some point some weird decisions get made and you find yourself with some strange corner cases like midnight being falsey. If you wanted to know if a date was midnight, wouldn't it be easy enough to just explicitly test that?
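The contrast is easy to see in Python itself: each explicit form answers one precise question, while the bare truthy test answers an ambiguous blend of them:

```python
x = []

# Explicit checks each ask exactly one question:
assert len(x) == 0      # is it empty?
assert x is not None    # was it ever set?

# The truthy test conflates them: empty? zero? None? the reader can't tell.
assert not x
```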


Completely agree. C# has this right. An expression must produce a valid boolean value; the language does not evaluate any random type as boolean based on a set of rules.

E.g. you can't write the following in C#:

    if (number)       // number is of type int
    if (o1)           // o1 is a reference-type variable
    if ("string")
    if (number = 10)  // an assignment; produces 10, not a bool

You must write:

    if (number == 0)
    if (o1 != null)
    if ("string".Length > 0)
    if (number == 10)


James Coglan recently pointed out that all of Python's falsy values are the additive identity of some type. Midnight fits the mold.

This results in some weird results from an intuitive perspective, but is very principled and elegant in other ways.

My one objection was that I don't know how None fits in.


Midnight is not the additive identity of points in time. There is no additive identity because there is no addition operation. You're confusing the ``time`` type with the ``timedelta`` type, which represents a duration and does have a 0-minute duration as an additive identity.

Midnight is not, nor has it ever been, the "zero time"; it's simply a point whose typical representation contains some 0s. It is no more falsey than the origin (0, 0) of the Cartesian coordinate system is falsey.
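The distinction is easy to verify: ``timedelta`` supports addition and has a zero element, while ``time`` supports no addition at all. A quick check, in the spirit of the comment above:

```python
from datetime import time, timedelta

# timedelta has an additive identity: the zero duration.
assert timedelta(0) + timedelta(hours=3) == timedelta(hours=3)

# time supports no addition at all, so it cannot have an additive
# identity; midnight is a point, not a zero.
try:
    time(0, 0) + time(3, 0)
except TypeError:
    pass
else:
    raise AssertionError("time + time is (correctly) undefined")
```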


You win: I was confusing those things! I was thinking times were treated as a timedelta ranging from 0 to 23 hours 59...


How is "midnight" an additive identity? It's not a time delta. "+0 hours" is an additive identity. (And if Python's time library doesn't distinguish between absolute and relative times, that's a design flaw.)

That's like saying "north" is an additive identity on a compass. It's not; "+0° clockwise" is an additive identity.


To take it from the same James Coglan -> "You can't add midnight to 3-o'clock in the same way you can't add London to Chicago."


As far as I understand from reading that whole thread, this is, ironically, precisely why local midnight evaluates to False. They left in an escape hatch to support the illogical idea of addition.


Lots of Python objects are falsey: empty lists, empty strings, etc. So it's never a good idea to write "if <thing>" when you mean "if <thing> is not None".

This is pretty well-known, I thought.
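A quick demonstration of why "if <thing>:" is not a None check:

```python
# All of these are falsy yet very much "set"; a bare "if value:" would
# treat them the same as None:
for value in (0, 0.0, "", [], {}, set()):
    assert not value
    assert value is not None
```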


A big part of it is about user expectations. I would be shocked if any experienced Python programmer who wasn't familiar with this exact implementation detail expected the following:

    if datetime.time(0, 0, 0):
        print "foo"
    else:
        print "bar"
to print "bar"


Oh man, people who have used Python for a while have come to expect datetime to just be strange in general. Naive values are implicitly in your local timezone, starting from midnight; you need to convert explicitly; there are different classes for epoch-style handling; and some of the members are, seemingly willy-nilly, missing or named differently. And that's just the start. That's why things like arrow exist: http://crsmithdev.com/arrow/ time and datetime in Python show their age.


Falsey objects mean "there's no data here" which is why they're falsey. Having an empty list of arguments to a command is the same as having no argument list.

There's nothing about midnight that makes it sensible for it to behave as False.


But, it's represented internally as zero. That's reason enough!


That's an implementation detail and should be open to change without breaking things.


My post was meant to be sarcastic. I thought it was ridiculous enough to not need something indicating sarcasm though...


Off the top of my head I can't think of a reason to check if a date exists, but I would certainly expect midnight to be truthy if I found a reason.


What about optional date fields in a Django application?


You check whether it's None, or check the type.


In Django you might have a nullable TimeField and check if it's set via "if obj.time_field".


I always write "if x is None", trying to be explicit.


Whilst reading that thread, I stumbled across:

  "goto fail" is a well-known error handling mechanism in open source 
  software, widely reputed for its robustness:
  
  http://opensource.apple.com/source/Security/Security-55471/libsecurity_ssl/lib/sslKeyExchange.c
  
  https://www.gitorious.org/gnutls/gnutls/source/6aa26f78150ccbdf0aec1878a41c17c41d358a3b:lib/x509/verify.c
  
  I believe Python needs to add support for this superior paradigm.
  
  It would involve a new keyword "fail" and some means of goto'ing to it. 
  I suggest "raise to fail":
  
  if (some_error):
     raise to fail
  
  fail:
        <error handling code>
  
  Unless there are many objections, this fantastic idea might be submitted 
  in a (short) PEP somewhere around the beginning of next month.
  
  There is some obvious overlap with the rejected "goto PEP" (PEP 3163) 
  and the Python 2.3 goto module. However, the superiority of goto fail as 
  error generation and error handling paradigm has since then been 
  thoroughly proven.
https://mail.python.org/pipermail/python-ideas/2014-March/02...


Python already comes with the logical conclusion of goto-fail out of the box. There's no need to add a new special feature for it.

(There is absolutely no certificate store checking for certs by default, nor is there any hostname checking, or any of the myriad of other checks one might expect a reasonable TLS implementation to perform. Use the requests module.)


Picking on C's error handling mechanism is like picking on its non-existent standard library: it's been done to death.


I think he understates the most powerful part of his argument.

Midnight is a value, not a special value. There is no reason why it or any other valid time should be falsey on a daily cycle.


Couldn't the same argument be made about zero?


There are times when I like the fact that zero is false in Perl.

However, usually I am happy that '0 is false' in PostgreSQL raises a type exception.


Zero has important, special properties not shared by any other number.


Yes, let's change that too.


I think the interesting part is what is revealed about Python and the difference with something like Ruby.

Python is stable[0] and places a high degree of importance on backwards compatibility.

This behaviour is well documented (and called out for particular note). This reinforces that it is (a) official and (b) not a bug because it is the documented behaviour.

On the other hand Ruby (and most Ruby libraries) seem both less concerned with backwards compatibility, have less thorough documentation[1] but are more willing to change and improve.

There isn't a right and a wrong between these approaches although for most things I think I would prefer something between the two. I think I generally prefer Python in terms of syntax (Ruby is a bit too flexible with too many ways to do things for my taste) but I do wonder if Python will be left a little behind.

[0] Python 2/3 transition is a single big deliberate change.

[1] I have an open Rails issue that I don't know if is a bug or not because there isn't documentation that is sufficient to compare the behaviour with so it is a case of what feels right/wrong: https://github.com/rails/rails/issues/6659


I'm gonna disagree with you! Well, I actually agree about ruby libraries, but the ruby standard library is the proper comparison here, which in my experience manages backwards compatibility and documentation of edge cases rather nicely. The (many) changes to the language itself since 1.8 have nearly all happened in backwards-compatible ways. So maybe the ruby community has a more flexible attitude (I've honestly never used enough external python libraries to have a good comparison point), but I don't think it's fair to say that the language itself is.


I was deliberately discussing the whole library ecosystem you could well be right about the stability of the core of Ruby.


Not being a Pythonista, I have the following questions:

1) Is there a (native or custom) date type in Python? Is it an object?

2) Midnight when? Today? This date last year? Sure there's a "zero value" for dates - it's the epoch for whichever platform or library you're using.

3) Why would anyone call it a "date" if it's really a time?

Maybe I'm getting off into the philosophical decisions of the reptile wranglers, but this particular debate sounds a lot like someone made a decision long ago that had ramifications reaching further than expected, and now the justification is ingrained, things are built on it, and no one's willing to make the 'correction.'


Python has dates, times, and datetimes[0]. They're all objects. So yes, you could get around this by always using a datetime. However, sometimes what you need is just a time.

[0] http://docs.python.org/2/library/datetime.html


It's midnight without a date. However, if there is a timezone attached to the time, then the falsy value is midnight UTC; unless the UTC offset of the time is negative, in which case it's never falsy.


The second paragraph of the link contains "datetime.time(0,0,0)".

I can understand not wanting to make many inferences, but that sort of answers your questions.


Python has weird ideas about comparisons, I'm pretty sure it's the only language where this is possible: https://eval.in/113749


That's been fixed in Python 3.


In Python 3, that raises a TypeError.
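The eval.in snippet isn't quoted here, but Python 2's cross-type ordering is the classic example of this class of weirdness: values of unrelated types could be ordered (largely by type name, so e.g. "a" > 1 was True). Python 3 refuses outright:

```python
# Python 2 allowed ordering values of unrelated types; Python 3 raises
# TypeError instead:
try:
    [] > 1
except TypeError:
    pass  # "not supported between instances of 'list' and 'int'"
else:
    raise AssertionError("expected TypeError on Python 3")
```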


The only reason for midnight being a falsy value that I can think of is that someone thought that all objects should provide some functionality for __nonzero__/__bool__.

It was a bad idea.


This kind of crap is exactly the reason why I don't like doing just "if var:" unless var is guaranteed to be a boolean.


Midnight UTC is zeros all the way down. Seems false to me, but I'm from the land of C. This seems to be in line with some low-level hardware or common assembly practice across many languages.

Everyone is talking higher echelons of consideration, but what effect is there on generated byte code or in fitting within the virtual machine's tight pants?


This offers a counterexample to the simplistic notion that 'duck typing' results in programs that automagically do the right thing. In reality, duck typing does not relieve you of the responsibility of understanding the semantics of the elements from which you construct a program.


On the plus side, Boolean Value: Midnight would make a great CS-themed action movie title.


This is freakishly similar to the discussion on a PHP bug I submitted in 2006:

https://bugs.php.net/bug.php?id=39579



While I agree this is surprising behavior and I wouldn't design an API this way, it is documented behavior. From the docs:

"in Boolean contexts, a time object is considered to be true if and only if, after converting it to minutes and subtracting utcoffset() (or 0 if that’s None), the result is non-zero"

Changing it at this point could break code that relied on documented library behavior. That's not a responsible thing to do.


Explicit is better than implicit.

Simple is better than complex.


I'm not sure what that is supposed to mean in this context. Testing for "is not None" is more explicit and avoids the trap being discussed. I don't know how the second sentence applies, maybe "if timeval:" is simpler?

BTW, did you know the original author of the Zen of Python has posted to this very discussion (while also being the original author of said module):

https://mail.python.org/pipermail/python-ideas/2014-March/02... https://mail.python.org/pipermail/python-ideas/2014-March/02...


"Simplicity" is not measured in terms of key presses or characters.

The "if timeval:" case may contain fewer characters, but it's less explicit. Being less explicit opens it up to greater ambiguity. Ambiguity is a form of complexity. Complexity is the opposite of simplicity.

The explicit "is not None" check may require more typing, but it's far more explicit and exact. That means it's much less ambiguous, and thus less complex, and thus exhibits greater simplicity.


The word "simple" means many things to many people, it's probably the most ambiguous line in the Zen of Python.


I really have to wonder if there is actually any code out there that relies on it. Even given Python's great popularity, I would not be surprised if fixing this broke nothing at all.


That's why the user was suggesting deprecating the feature.


In every other language I've used, a time value of 0 is used when a datetime only contains a date and doesn't have a specific time. The existing behavior would make sense in that context. I know Python also has a separate date object, are the two interchangeable enough that you could mix and match without problems?


I came across a similar issue when using Rails the other day, where I gave my model a boolean field with a presence validation. The presence validation of a boolean field fails if the bool is set to false; it had me confused for a while, but it wasn't a big enough issue for me to research/report.


It seems there are two choices:

1. Before applying a numerical value to a Boolean test, ask whether it can ever be zero when that's not the intent of the test.

2. Create a new rule that forbids testing numerical values as though they're Booleans, and break nearly every program in existence.

Hmm ... wait ... I'm thinking it over.


Why would anyone evaluate dates in a boolean context? They are (should be) always True.


> They are (should be) always True.

This is the obvious intuition, which doesn't match the current implementation.

> Why would anyone evaluate dates in a boolean context?

I'd guess, usually to tell whether a variable has a date or None. For example, maybe you have some kind of GUI with a form where the user is supposed to fill in a time, and some variable somewhere in the code is set to None [1] when the time field is blank [2], and a time object representing the time entered by the user otherwise.

Maybe you want to require the user to fill in all fields in the form before advancing to the next step in your program's workflow. If you like falsiness, you might write:

    if not form.time_field:
        show_warning("You must enter a time to proceed")
        continue_current_phase()
    else:
        proceed_to_next_phase()
You actually intended the first line to be:

    if form.time_field is None:
But the only way you'd find this bug is when a user on the East Coast complains that the form won't accept 7:00 (UTC midnight). It will accept 6:59 or 7:01, but not 7:00. Of course you're on the West Coast, so you close their ticket as "can't reproduce" since 7:00 works just fine for you...

[1] None is the Python equivalent of other programming languages' null (or NULL).

[2] Or unparseable
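A self-contained version of that scenario (the function names are mine, purely illustrative). Note that on Python 3.5+, where this bug was fixed, the truthy version happens to work too; on earlier versions it silently rejected falsy times:

```python
from datetime import time

def validate_truthy(time_field):
    # Buggy on pre-3.5 Pythons: times that evaluated falsy (e.g. naive
    # midnight) were treated like a missing value.
    if not time_field:
        return "You must enter a time to proceed"
    return "ok"

def validate_explicit(time_field):
    # The intended check: only reject a genuinely missing value.
    if time_field is None:
        return "You must enter a time to proceed"
    return "ok"

assert validate_explicit(None) != "ok"
assert validate_explicit(time(0, 0)) == "ok"
# On modern Pythons time(0, 0) is truthy, so this passes as well:
assert validate_truthy(time(0, 0)) == "ok"
```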


Creeping semi-booleans make me very uncomfortable. But what's the alternative? A-values and I-values? A "μ" for questions unanswerable in the type system? Just punt and let Javascriptisms take over the world?


This is how languages die. I wasn't aware that Python had become such a bureaucracy.

The current behavior is insane - just fix it! No need for days of discussion on the mailing list or three-point non regression plans.


Deprecate datetime and introduce datetime2 with better behavior for midnight. Problem solved.




