I am new to SAP HANA and came across several issues when handling large tables (>2 billion rows).
As far as I understand, such tables need to be partitioned when they are written, but apart from that one should not need to bother. I am therefore quite surprised to constantly run into trouble when querying large tables. Performance reasons for splitting such tables aside, I am wondering whether there is really no other option.
To be more specific, I am working with the following setup (a rough DDL sketch follows the list):
- A table "T" with primary key "a" and roughly 2.2 billion rows
- A table "S" with several million rows
- A decomposition of "T" into "T1" and "T2" of similar size such that T = T1 union T2
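For illustration, the setup looks roughly like this; column names, types and partition counts are simplified, and the actual partitioning scheme of "T" may differ:

create column table T (
    a bigint not null,
    b integer,
    primary key (a)
) partition by hash (a) partitions 16;  -- partitioned when written

create column table S (
    a bigint not null,
    primary key (a)
);

-- T1 and T2 have the same structure as T and each hold roughly half of its rows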
On these tables the following statements fail:
1.
select T.a, S.a
from T
inner join S
on T.a=S.a
--> SAP DBTech JDBC: [2048]: column store error: search table error: [2598] column search intermediate result exceeds 2 billion rows limitation
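So far, the only way around this that I can see is to split the join manually into ranges of "a" and combine the partial results afterwards, which seems rather cumbersome. Roughly like this (range boundaries purely illustrative):

select T.a, S.a
from T
inner join S
on T.a = S.a
where T.a < 1100000000;
-- ... repeated for the remaining ranges of "a", then the partial results are combined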
2.
select b, count(a)
from (
select a, b from T1
union
select a, b from T2
)
group by b;
--> SAP DBTech JDBC: [2048]: column store error: search table error: [34104] Intermediate result is too large in CalculationEngine.
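Again, the only alternative I see is to push the aggregation down into the sub-selects, assuming "T1" and "T2" are disjoint so that union all is safe (sketch only):

select b, sum(cnt)
from (
    select b, count(a) as cnt from T1 group by b
    union all  -- assumes T1 and T2 do not overlap
    select b, count(a) as cnt from T2 group by b
)
group by b;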
3.
create column table MYTABLE as (
select a from T
) partition by roundrobin partitions 16;
--> [129]: transaction rolled back by an internal error: exception 10001001: Search result size limit exceeded: 2236482994
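Here, too, the only workaround I can think of is to create the partitioned target table first and then fill it in slices of "a", roughly like this (type of "a" and slice boundaries again purely illustrative):

create column table MYTABLE (a bigint) partition by roundrobin partitions 16;
insert into MYTABLE select a from T where a < 1100000000;   -- slice 1
insert into MYTABLE select a from T where a >= 1100000000;  -- slice 2
-- ... with more slices if a single insert still exceeds the limit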
In general, it appears that large tables (>2 billion rows) can be handled as long as they are already partitioned and persisted. In contrast, when such a result is created during a query (e.g. by a join or a union), HANA mostly terminates with an error.
Is this behaviour intended by HANA?
Is there an explicit rule for what HANA can handle?
Are there hints or other routines to overcome the described issues?
I really appreciate any clarifications.
Cheers,
Mapkyc