Skip to content
This repository has been archived by the owner on Oct 8, 2019. It is now read-only.

Enhancement - SQLFlow JSON Output: Include information from INSERT and DELETE statements in procedures. #8

Open
fprima opened this issue Jul 10, 2019 · 14 comments
Assignees

Comments

@fprima
Copy link

fprima commented Jul 10, 2019

We consider this as lineage because in some cases we may have store procedures that produce a full refresh of data. That is doing a Truncate/Delete followed by and Insert. For these type of store procedures, we will have no lineage from the JSON. We have seen in the SQLFlow demo in the Sample SQl that relationships are included in the JSON for INSERT statements. That is why we were wondering why they are not included in the store proc JSON.

@sqlparser
Copy link
Owner

@fprima Is it possible to post a sample stored proc?

@fprima
Copy link
Author

fprima commented Jul 15, 2019

ProcedureExample.txt

@fprima fprima closed this as completed Jul 15, 2019
@fprima
Copy link
Author

fprima commented Jul 15, 2019

@sqlparser above please find a sample procedure.

@sqlparser
Copy link
Owner

@cnfree
add procedure in the dbobjs section of the JSON output in SQLFlow output:

            {
                "arguments": [
                    {
                        "coordinates": [
                            {
                                "x": 8,
                                "y": 24
                            },
                            {
                                "x": 8,
                                "y": 34
                            }
                        ],
                        "id": "2",
                        "name": "pv_num1",
						"datatype":"number",
						"inout":"in"
                    },
                    {
                        "coordinates": [
                            {
                                "x": 8,
                                "y": 36
                            },
                            {
                                "x": 8,
                                "y": 40
                            }
                        ],
                        "id": "4",
                        "name": "pv_num2",
						"datatype":"number",
						"inout":"inout"
                    }
                ],
                "coordinates": [
                    {
                        "x": 8,
                        "y": 15
                    },
                    {
                        "x": 8,
                        "y": 22
                    }
                ],
                "id": "1",
                "name": "student_department_mod",
                "type": "procedure"
            }

@sqlparser
Copy link
Owner

sqlparser commented Jul 20, 2019

@fprima what's kind of information do you need in the JSON output for delete and update statement in the procedure.

create or replace procedure student_department_mod(pv_num1 in number,pv_num2 in out number)
is
   lr_address 	   address%rowtype;
   
   
begin
  --insert
  insert into STUDENT (STUDENT_ID, NAME, AGE)
  values (9998646, 'JON', 34);

  -- select
  select *  into lr_address from ADDRESS where STUDENT_ID  = pv_num2;

   --update
  update DEPARTMENT
  set modified_date = sysdate
  where department_id  = 4;   
  
  --delete
  delete from EMPLOYEE where employee_id  = 9998;
end;

@cnfree
Copy link
Collaborator

cnfree commented Jul 20, 2019

@sqlparser
Implemented it.

@fprima
Copy link
Author

fprima commented Jul 26, 2019

@cnfree thanks for the update.

I checked the JSON output and see some issues. For example in the update statement we have "set modified_date = sysdate" and in the output I see Target: "column": "modified_date" and Source: "column": "modified_date" instead of sys date.
Also under dbobs I see branches for the procedure name/parameters, tables (STUDENT and ADDRESS), RESULT_OF_SELECT-QUERY, UPDATE-SET, but I do not see a branch for INSERT and DELETE. Also I was expecting an entry to the target table 'lr_address' and relations entry for 'where STUDENT_ID = pv_num2'.

Franco

@fprima
Copy link
Author

fprima commented Jul 26, 2019

@cnfree another item is for example In the JSON below, "clauseType" is missing for the STUDENT_ID column although type is 'frd'. It should have "clauseType = where" in source
"id": "5_0",
"type": "frd",
"effectType": "select",
"target": {
"id": "15_0",
"column": "STUDENT_ID",
"parentId": "14",
"parentName": "RESULT_OF_SELECT-QUERY",
"coordinates": [
{
"x": 12,
"y": 10
},
{
"x": 12,
"y": 11
}
]
},
"sources": [
{
"id": "13",
"column": "STUDENT_ID",
"parentId": "11",
"parentName": "ADDRESS",
"coordinates": [
{
"x": 12,
"y": 48
},
{
"x": 12,
"y": 58
}
]
}
]

@fprima
Copy link
Author

fprima commented Jul 26, 2019

@cnfree and lastly, were this changes committed to the package that we can download?

@cnfree
Copy link
Collaborator

cnfree commented Jul 27, 2019

@fprima The clauseType property is just used by join relation, but not frd relation.

@cnfree
Copy link
Collaborator

cnfree commented Jul 27, 2019

@cnfree and lastly, were this changes committed to the package that we can download?

@sqlparser please answer for this question.

@sqlparser
Copy link
Owner

Please update the DataFlowAnalyzer.java to support procedure input/output parameter. @cnfree

@cnfree
Copy link
Collaborator

cnfree commented Aug 22, 2019

@fprima please checkout https://github.com/sqlparser/gsp_demo and run DataFlowAnalyzer.java.

And the new version handled the sysdate keyword as a constant.

@techtronsoft
Copy link

去公司官网看了下,东西挺好用,想买个资料无奈官网一个电话都没留,邮件也没人回复

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants