Back Home Up Next
| |
Computed fields with conditional statements.
It's a typical question whether IB supports the CASE
statement. Currently, it doesn't. Other questions are if a CASE or IF statement
can be embedded inside aggregate functions like SUM or AVG, like MsSql allows.
The answer is no, since those functions accept a field name or an expression,
but not conditional statements. Moreover, computed fields don't accept
conditional statements. Several people have attempted to put IFs inside computed
fields, without success.
First, let me say that server-calculated expressions are not exactly the same
than computed fields. A server-calculated expression occurs each time you write
something like
select b, h, b*h*1.0/2.0 as area from triangles
being "area" the server-calculated field. This is to distinguish it
from client-calculated fields, performed in procedural code in the client
application once the record set has been obtained from the server. Computed
fields, on the other hand, aren't ANSI SQL feature; they are proprietary
constructions of IB and are included as one field more of the table, in the
definition of the table. Therefore, they can be defined as server-calculated
fields that are made part of the table definition, like another field. Now, the
problem is how to provide conditional processing inside the
COMPUTED BY (expr) clause accepted by IB. Usually, "expr" doesn't
allow the SELECT statement, so it's constrained to what a server-calculated
expression can produce.
Over time, three types of workarounds have been presented. I will examine
them briefly and explain the weaknesses:
- The first one is to write a VIEW including UNION ALL statements that glue
the different SQL sentences according to each condition:
create view getmax(f_a, f_b, f_max) as
select a, b, a from tbl where a>=b
UNION ALL
select a, b, b from tbl where a<b
Of course, those sentences should use the same number and type of fields to
be glued by UNION or UNION ALL. The latter means removing overhead if you know
that the different sentences produce disjoint results, hence the engine
doesn't lose time and resources looking for records that are complete
duplicates (as it would do with UNION). Probably a SQL puritan will choose
this solution. The problem is that the engine should filter by the condition
that separates the subparts only to allow different processing, because
classic SQL didn't include procedural instructions. Depending on the
condition, the filtering can be expensive.
- The second workaround is to write a procedure. This is the recommended and
preferred way to go of IB developers, since IB supports to so-called selectable
stored procedures, that can be invoked in a SELECT statement in the
place where a table name would occur, but including input parameters if
necessary. A procedure can take more complex conditions without too much
overhead, because it stores fields in temporal variables and hence, it can
work on each record as it comes, without a filter that's needed in the UNION
ALL case. Depending on the driver and application connecting to IB for the
needed task, using a proc instead of a table might be cumbersome or tricky
(this seems to be the case with some external reporting facilities, for
example). Also, a proc should be slower than selecting directly from a
table.
- The third workaround is to write a UDF expression to produce the result.
This involves playing with sign(), abs(), substr() and other functions that
come with the default UDF library included with IB. For example, in the case
shown above, the formula would be:
create table t(a int, b int,
f_max computed by (div(sign(a-b)+1,2)*a+div(sign(b-a)+1,2)*b+(1-abs(sign(a-b)))*a));
In the example, I used div() because it doesn't depend on the dialect for
the result and I need an integral division. Usually, the code looks ugly and
obscure to the casual reader and even for the author after time has passed.
I've posted to newsgroups solutions like those to show the trickery of UDFs,
but certainly, the operations involved doesn't contribute to the readability
of the code.
Given those known tricks, I wasn't too surprised that people usually said
that conditional statements in computed fields are a chimera. However, I was
surprised that no one paid attention to my claim that calling a stored procedure
would do the trick. Overkill? Convoluted? Impossible? Let's see: first, we build
a selectable stored procedure that handles all the conditions we want, provided the parameters:
set term ^;
create procedure proc_example(a int, b int)
returns (c int)
as begin
if (a is null and b is null)
then c = 0;
else if (a is null)
then c = b;
else if (b is null)
then c = a;
else c = a * 2 + b * 3;
if (c < 0)
then c = -c;
suspend;
end ^
set term ;^
Second, we need to build the table definition, including
the computed field. Here come the rough edges that might have caused some people
to give up:
- To produce a result in the computed field, we need the SELECT/FROM syntax.
- To be able to put a SELECT statement inside the computed field, we need
the a trick that works on other places: an extra pair of parenthesis.
- To be able to pass the table's current record's field values to the
procedure, we need to prefix those field names with the table name. I know
it sounds bizarre, but this is required in practice.
create table table_computed(
f1 int, f2 int,
f3 computed by ((select c
from proc_example(table_computed.f1, table_computed.f2))));
It's time to test the functionality with a couple of insertions that make the
proc to exercise all the conditional branches it has:
insert into table_computed(f1,f2) values(null,null);
insert into table_computed(f1,f2) values(null,0);
insert into table_computed(f1,f2) values(0,null);
insert into table_computed(f1,f2) values(0,0);
insert into table_computed(f1,f2) values(0,1);
insert into table_computed(f1,f2) values(1,0);
insert into table_computed(f1,f2) values(1,1);
insert into table_computed(f1,f2) values(-1,-1);
Now, do the lazy select * from table_computed and it will
produce:
| <null> |
<null> |
0 |
| <null> |
0 |
0 |
| 0 |
<null> |
0 |
| 0 |
0 |
0 |
| 0 |
1 |
3 |
| 1 |
0 |
2 |
| 1 |
1 |
5 |
| -1 |
-1 |
5 |
The purpose of this document is to show that sometimes, IB offers more
functionality than the casual reviewer sees in the engine. If you really prefer
to have a computed field with conditional statements, you can have one, with the
native functionality of the engine.
Important: short after releasing this document, I was faced by an apparent
tough question: a person created a table with a computed field driven by a
stored procedure but also that table was joined with another one and the whole
construction put inside a VIEW. The net result is that the VIEW doesn't show the
computed field. My first public reaction was "send me your metadata
privately and I will take a peek". However, some hours later, without
checking whether I received the metadata or not, I thought I had remembered the source of the
problem: if you
read LangRef, you'll see that it states explicitly that VIEWS cannot be based on
procedures. I would have supposed that IB would check for such restriction. For
example, I have a selectable procedure named GET_DATES that receives one
parameter. If I try to execute this DDL statement
create view v_getdates as select * from get_dates(current_timestamp)
I get a nice error message, stating that the table "GET_DATES" was not
found and if I try to put the correct call, with the parameter,
create view v_getdates as select * from get_dates(current_timestamp)
then the left parenthesis is rejected. Okay, IB enforces the restriction.
However, to my surprise, I created a VIEW based on such table and IB didn't
complain:
create view view_computed as select * from table_computed
It seems that the calculated field was something IB was not prepared to find and
it passed as an slippery sentence. Come on, dear gurus, where's the limit of the
formal restrictions and where's the limit of the real possibilities of the
engine? A VIEW is not allowed to use a procedure (as seen on the docs and
tested) but a procedure camouflaged in a COMPUTED field is accepted and it runs!
To ensure I was not misinterpreting visually the results, I compared
select f1,f2,f3 from table_computed
union all
select f1,f2,f3 from view_computed order by 1
against
select f1,f2,f3 from table_computed
union
select f1,f2,f3 from view_computed order by 1
and the former returned the double of records than the latter. So, the computed
field is working inside the VIEW. I went further and created a second VIEW that
does a JOIN of the table with itself and finally a third VIEW that does a UNION
of the table with itself. All of them show the computed field.
©Claudio Valderrama, Dec-2000.
|