Pages

Wednesday, April 18, 2012

Debuggin with sql trace

Last week one of our Pro*C programs stopped working. The mission of this program is to read data from one table and copy some columns to other table's columns. Every few months the source table is  filled with sqlldr, which is done using an OS user.
It seems that one of those times where sqlldr was run,  the locale os environment was not properly setup, and the data was entered in the table with the wrong format (ISO8859, instead of UTF-8).

Now when the pro*c was ran, it stop with the following message:
ORA-29275: partial multibyte character

WRN-ORACERR: Error occurred in file "svpceeb.pc" at line 626
WRN-ERRSTMT: Following statement was last statement parsed:
    select count(*)  ,SPRIDEN_ID  from SPRIDEN ,SPBPERS where (((((((SPRID


As you see, the error shows the last parsed instructions. My first thought? Piece of cake, I will add a printf statment to print some identification of the record been processed, so it is easy to know exactly which one failed.
 I failed miserably on my attempt to get the processed row to print. This is a ICU environment and the programs use a lot of _TM macros. I tried printing a single variable using the following statement:
  tmprintf(&tmBundle,_TMC("\n string text, {0}”),variable); where variable was one of the bind variables used by the select count clause. 
The tmprinf always caused a memory fault when I tried. I'll take a deeper look later, but as usually happens in production you don't have the luxury of time, you need to fix the problem as soon as possible. 

Having spent some time trying to make tmprintf to work, I gave up and moved to the sql trace facility.

I added the following code, just before the failing sql statement:


  EXEC SQL
        ALTER SESSION SET tracefile_identifier = 'svpceeb_debug';
    POSTORA;
    EXEC SQL
        ALTER SESSION SET sql_trace = true;
    POSTORA;
    EXEC SQL
        ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';
    POSTORA;


And this one after the failing sql statement: 

 EXEC SQL
        ALTER SESSION SET sql_trace = false;
    POSTORA;

 In order to be able to trace to a file, your username must have the 'ALTER SESSION' privilege. So login into sqlplus, sqldeveloper, toad or as a user with the right privileges to issue a : GRANT ALTER SESSION TO <USERNAME>




Now go and run the Pro*C using the USERNAME you just gave the privilege just now.


While the program runs, oracle will log, bind variables, wait events and sql tracing.  It is looking at the trace file where you can get the information of what row was the last retrieved successfully from the database. 
A useful trick to review the file:  In my case there were more than 70,000 rows processed before the program aborted. That makes for a huge log file not very easy to open.  So I just get the cursor ID of the query I knew was failing. It is something like "BINDS #46953007798952:"   where 469530077798952 is the id of the CURSOR of interest.
   so I did this:    grep -A5 "BINDS #46953007798952:"  trace_file.trc  | grep "value=" 
The output of the above command is a list one column that can be easily queried using sqldeveloper and determine which row forced the program to abort.
You must be careful that us use the same query in sqldeveloper so the order is the same order you get within the trace file. I did not have to play with the execution plan but I suppose that in some situation that would be the case.