There are various Data provisioning techniques for HANA and some of the recommended options are listed in the SAP HANA Master Guide. However, one of the customers I worked with wanted to leverage SAP PI to push data from external sources into HANA. One of the main reasons were to leverage the existing infrastructure and monitoring capabilities of SAP PI. This article covers most of the challenges we faced in several stages and I hope this is helpful.
Versions: SAP PI 7.1 (JVM 5) and SAP HANA Rev 85
Load Volume: 4 Million records
Load Schedule: Daily mornings
We decided take the PI – JDBC – HANA path.
On HANA, we had created a Schema and relevant tables which would be used during the Data acquisition process. The first challenge was “How to get PI talking to HANA”. We created a Stored Procedure in SAP HANA which would get the contents of a file and process it. The role of the Stored Procedure was to extract the records from the input and insert them into the target table.
Setup of JDBC Driver
There is a document on SCN (bit old) which refers to the steps for installing the driver.
http://scn.sap.com/docs/DOC-29889
We took JDBC Driver com.sap.ndb.studio.jdbc_1.85.0.jar file (from HANA Client Rev 85) and renamed it to com.sap.db.jdbc.Driver.jar. Added this jar file into the com.sap.aii.adapter.lib library and deployed in SAP PI via JSPM. This enabled the communication between PI and HANA via JDBC.
Using a CLOB data type to pass data from PI to HANA Stored Procedure
Though we were able to establish a communication from PI to HANA, we wanted to be able to pass large volume of records from PI. There would be multiple flat files amounting to 4 millions records.
Below are the data types which are supported by Receiver JDBC Adapter
INTEGER, BIT, TINYINT, SMALLINT, BIGINT, FLOAT, REAL, DOUBLE, NUMERIC, DECIMAL, CHAR, VARCHAR, STRING, LONGVARCHAR, DATE, TIME, TIMESTAMP, BINARY, VARBINARY, LONGVARBINARY, BLOB (input and output), CLOB (input and output), CURSOR (output; only in connection with the Oracle JDBC driver)
We decided to use CLOB as a means to transfer all the file contents. Would have been great if there was a way to pass an internal table to the Stored Procedure.
We still had errors when trying to use CLOB objects during the communication. A SAP Incident suggested to take rt.jar from the SAPJVM folder(sapjvm_5/jre/lib) and add it into the com.sap.aii.adapter.lib library. This still didn’t resolve our problem.
Finally we took JDBC Driver ngdbc.jar that comes with HANA client revision 94 from HANA Studio and renamed it as com.sap.db.jdbc.Driver.jar and updated it in the library as mentioned above. This version of jar has fixes to address the issue with CLOB data type.
Parsing of the PI Payload in the Stored Procedure
We performed the message mapping in PI such that PI concatenates all the records in the file with a record delimiter (#) and passes the entire payload to HANA Stored Procedure. The SQL within the Stored Procedure parses the CLOB based on record delimiter and inserts the records into the target table. We bumped into an Out-Of-Memory (OOM) exception in the Stored Procedure due to the way we wrote the SQL. The SQL couldn’t handle more than 1000 records within the CLOB. There is a bug when using CLOB variable. Reassigning values to the same CLOB variable doesn’t release the memory for some reasons. Hence, we had to change the approach and wrote the below SQL
-- Sample CLOB Data 'SAPSPAU06200100000001|2500000001#SAPSPAU06200100000002|2500000002#SAPSPAU06200100000003|2500000003# DECLARE V_RECORD_DELIMT VARCHAR(10) := '#'; DECLARE V_FIELD_DELIMT VARCHAR(10) := '|'; V_POS := 1; V_LOCATE := 0; V_FLAG := ''; V_LEN := LENGTH(:p_payload); --payload variable contains the actual CLOB data WHILE V_FLAG != 'X' DO V_LOCATE := LOCATE(:p_payload, :V_RECORD_DELIMT, :V_LOCATE + 1, 12); -- Locate the 12th occurrence of record delimiter IF V_LOCATE > 0 THEN V_PACKAGE := TRIM(SUBSTRING(:p_payload, :V_POS, (:V_LOCATE - :V_POS) + 1)); --Extract only 12 records in V_PACKAGE including record delimiter ELSE IF V_POS > V_LEN THEN --Reached End of File. The CLOB contains records in multiples of 12 BREAK; END IF; --Reached End of File. The CLOB does not contains records in multiples of 12. Extract the last few records into V_PACKAGE V_PACKAGE := TRIM(SUBSTRING(:p_payload, :V_POS)); V_FLAG := 'X'; -- End of Loop reached END IF; WHILE LENGTH(:V_PACKAGE) > 0 DO -- In the below Loop, extract individual records and fields to insert into Table V_RECORD := TRIM(SUBSTR_BEFORE(:V_PACKAGE, :V_RECORD_DELIMT)); -- Extract a record from CLOB I := 1; -- Reset Array Counter ARRAY_ID := ARRAY(); -- Clear the Array WHILE V_RECORD != '' DO V_FIELD := SUBSTR_BEFORE(:V_RECORD, :V_FIELD_DELIMT); -- Extract a field from Record V_LENGTH := LENGTH (:V_FIELD); IF V_LENGTH != 0 THEN ARRAY_ID[:I] := V_FIELD; ELSE --Check if this is the last field in the record or if further fields exist V_FIELD := SUBSTR_AFTER(:V_RECORD, :V_FIELD_DELIMT); V_LENGTH := LENGTH (:V_FIELD); IF V_LENGTH != 0 THEN -- There are more fields in the records ARRAY_ID[:I] := NULL; ELSE ARRAY_ID[:I] := :V_RECORD; END IF; END IF; V_RECORD := SUBSTR_AFTER(:V_RECORD, :V_FIELD_DELIMT); I := :I + 1; END WHILE; INSERT INTO <TARGET TABLE> VALUES( :ARRAY_ID[1], :ARRAY_ID[2] ); V_PACKAGE := SUBSTR_AFTER(:V_PACKAGE, :V_RECORD_DELIMT); END WHILE; V_POS := :V_LOCATE + 1; --skip the record delimiter END WHILE;
Tuning the Data Load
When we tested the interface with 4 million records, we noticed that there was a high CPU Utilization for SAP PI. To avoid this problem, we set the following parameters based on our existing environment.
- Integration Engine Configuration Parameter “EO_OUTBOUND_PARALLEL” : 20
- JDBC Communication Channel Settings: poolWaitingTime - 180000
- JDBC Communication Channel Settings: Maximum Concurrency - 4
- File Communication Channel: Record Structure – Payload,8000 (To ensure that each message has no more than 8000 records)
Timings:
We were able to achieve a load time of 25 minutes for 4 million records and at the same time keeping the PI/HANA CPU Utilizations under 30%
Things to consider:
Though PI message monitoring could be used to track errors on PI, we had to set up an Error handling framework (Tables/Exception Handling) to capture issues when Stored Procedure is executing. Reporting of these errors/Alerting was a pain point.There are limited SQL functions which can be applied on a CLOB compared to a STRING.
We also managed to come up with an alternative solution, where SAP PI invokes an ABAP Proxy and the proxy calls an ABAP Managed Database Procedures (AMDP Class) which again invokes the Stored Procedure created in the new Schema. There wouldn’t be any CLOB objects used in this scenario as AMDP Class could pass an internal table to the Stored Procedure and there wouldn’t be any need to parse objects. Since we have a SoH system, we could have tried to explore this path. But didn’t have the time and was again not sure about the performance.
More info on AMDP: http://scn.sap.com/docs/DOC-51612