SQL Gotchas
I think you can imagine the amount of paint involved in having a query behave in an unexpected manner. I have run into both of those recently. This one had me doubting my sanity (imagine this on a table with several hundred thousands records, with a fairly complex query around it:
select 1
where 1 not in (2,3,null)
select 1 where 'e' = 'e '
Comments
I assume that you were surprised that first select did not return any rows. It makes perfect sense if your database works with NULLS by the ANSI SQL rules. Under those rules comparing X to NULL is nor true nor false, it's unknown. So if I rewrite your first statement like this:
select 1 from MY_TABLE where ((1 <> 2) AND (1 <> 3) AND (1 <> null))
then the last comparision if unknown (and interpreted as false) . You can also try:
select 1 from MY_TABLE where (1 = null ) or (1 <> null)
and you will still get 0 rows. I tested on Oracle. Think that SQL Server by default works with nulls a little bit differently.
Regards, Petar
Petar,
I would expect it if I was thinking about it using hard coded, I would have expected it to work differently for in and not in.
It makes sense that it will need to maintain the same behavior, it just that it sucks, that is all.
And that behavior is on SQL Server as well
This is maybe on case whee LINQ comes in handy where these constant expressions are evaluated on the client:
First query in LINQ to SQL:
Clients.Select(i => 1).Where(i => "a" == "a ")
no result, translates to:
SELECT [t1].[value]
FROM (
WHERE @p1 = 1
-- @p0: Input Int32 (Size = 0; Prec = 0; Scale = 0) [1]
-- @p1: Input Int32 (Size = 0; Prec = 0; Scale = 0) [0]
Query 2:
Clients.Select(i => 1).Where(i => !(new List<object>{2, 3, null}.Contains(1)))
return all rows, translates to
SELECT @p0 AS [value]
FROM [Clients] AS [t0]
-- @p0: Input Int32 (Size = 0; Prec = 0; Scale = 0) [1]
Interesting...I wonder if that is really the correct way for linq2sql to handle it?
/Mats
Dear SQL (and T-SQL and PL-SQL):
DIAF.
KKTHX
You can use the SQL LIKE operator (without any percentage signs) to perform an exact match.
I'm afraid I did not understand exactly what you mean by that. You are not generating that ad-hoc SQL in runtime and than send it to db, I hope ?
The first one I can understand because of the funky NULL definition. Hard to justify the second one. Especially when SELECT 1 WHERE 'e' = ' e' works "as expected".
Hehe. I had exactly the same problem, and blogged about it a few months back...
http://www.tobinharris.com/2007/7/11/select-aaaargh-where-pain-in-null
Like an arrogant fool, I almost filed a bug :)
The problem is this:
"What's happening here is that a null is neither equal or unequal to anything, therefore when we ask if 1 is NOT in (null,2,3), the database can't say that's strictly true"
The second select is not hard to understand if we know that (on Oracle):
1) Text literals are treated as CHAR datatypes and
2) when comparing CHAR datatype trailing spaces are ignored.
So 'e' = 'e ' actually evaluates to true.
There is nothing strange about this. It's just the way database works.
Petar
Here's one I ran into at my last gig:
SELECT 1
FROM TBL_Empty
WHERE 1 = 2
And yes, TBL_Empty really was empty... not that it would matter.
Petar ,
Of course I am generating it in runtime :-)
More seriously, I meant that the real query was:
select * from source
where id not id (select parent from child)
Ayende: on MSSQL, try SET ANSI NULLS OFF. This makes all the comparisons involving nulls to behave as you'd expect them to, so you can do "Something = null" and "Something <> null" to mean "Something IS null" and "Something IS NOT null".
Comes in extremely handy with parameters:
SET ANSI NULLS OFF
SELECT * FROM Table WHERE ParentID = @parentId
-- @parentId can be null, which would yield all "orphaned" rows
Petar: You say "There is nothing strange about this. It's just the way database works". Of course there's something strange about this. The fact that this is the way the DB works is irrelevant - it is contrary to intuition and stands in contrast to every other programming language on the planet. That, my friend, is exactly what I call "strange".
The fact that you use "=" for a kinda-like-it comparison, but using the "LIKE" keyword gives you an exact match - well, that's just plain stupid.
I knew all about the first one already, but the 2nd one really made me say "hmmmm". The weird thing is that when I copy-and-paste the SQL code from my newsreader, it returns a 1, suggesting that a string is equal to the same string with appended spaces. What?
Upon closer examination, stranger things appeared:
A string that I append spaces to manually does, in fact, evaluate as being equality. I assume that this has something to do with converting between fixed-length CHARs and varchars.
The string that I copy from Firefox behaves exactly as I describe above.
The string that I copy from my newsreader evaluates them as NOT equal, returning no rows. A bit more investigation shows that the padding character copied out of my newsreader isn't an ascii 32, but is rather an ascii 160, a non-breaking space. Go figure.
ASCII has non-breaking space? That's news.
Avish: Try this:
Is this strange ? Is it contrary to intuition ?
Equals operator is dependent on its operators type. The fact that symbol "=" is used for many different equals operators doesn't matter. Comparing numbers is different than comparing CHARs or VARCHARs or matrices or complex objects. Or strings in C#.
I would imagine that 'e' = 'e ' returns true to make char and varchar comparisons "seamless" since char fields typically have whitespace padding if the value doesn't occupy the fixed length. If the above condition were to evaluate to false, then comparing CHARs and VARCHARs would be syntactically brutal (would need casts all over the place).
See this will teach me to not stay current with your blog posts.
I ran into this EXACT problem the day after you posted this, and I ended up writing my own post on the topic.
http://www.chrismay.org/2007/10/29/SQL+NOT+IN+Will+Fail+If+The+List+Contains+A+Null.aspx
Avish,
"it is contrary to intuition and stands in contrast to every other programming language on the planet. "
Be as it may with programming languages, I don't agree about your argument about intuition.
Null means that you don't know the answer to the question.
Say that I have compiled three different lists, each list containing the names of, say, all the congressmen and each list representing something compromising. By the name of each congressman, on each list, I can mark "yes (they have done it)", "no (they haven't done it)" or "null (unknown if they have done it)" - that is, we use a nullable boolean by each name.
Now, say my republican sponsor asked of me: Give me those lists where it is known that no republicans have done the compromising thing, so I can use it in my campaign.
What if I then gave him the lists where the repuclican names had null values? Later in the campaign, it would turn out - once the value for the bit did become known - that they had indeed done whatever it was and it would be embarrassing for the campaign.
If the sponsor had asked me: Give me those lists where no republican names are known to have done the compromising thing, then I could have given him the lists with the nulls, but not to satisfy his original request, which is the request that translates to what you ask of the database with the SQL query in question.
/Mats
Comment preview