DBA Blogs
Design decision on database tables oracle DB
SQL developer hanging
Social Sign-In with Azure
Oracle returns default value for column on rows inserted before the column was added
XMLTYPE returning unknown special character
Sql Plan Baseline
Deleting duplicate records without using rowid and rownum
Table Design
ORA-04063 Errors During Datapatch Rollback of Oracle Patch 30763851
I am working on an Oracle 19c database running on a RHEL 7 VM. I have been trying to roll back this patch:
Patch 30763851 IMPDP 11.2 TO 18C OR HIGHER HITS ORA-904 WHEN TABLES HAVE EXTENDED STATISTICS
I have it sitting on top of 19.5
Database Release Update: 19.5.0.0.191015 (30125133)
The Opatch rollback command ran fine, but Datapatch threw these errors:
[2025-10-22 18:50:15] -> Error at line 11329: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_OPQTYPE_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11331: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_OPQTYPE_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11333: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_OPQTYPE_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11335: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_OPQTYPE_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11337: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_OPQTYPE_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11339: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_OPQTYPE_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11341: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_OPQTYPE_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11343: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_OPQTYPE_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11355: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_P2TPARTCOL_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11357: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_P2TPARTCOL_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11363: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_SP2TPARTCOL_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11365: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_SP2TPARTCOL_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11381: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_COLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11383: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_COLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11385: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_COLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11387: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_COLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11389: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_PCOLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11391: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_PCOLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11393: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_PCOLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11395: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_PCOLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11397: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_P2TCOLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11399: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_P2TCOLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11401: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_SP2TCOLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11403: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_SP2TCOLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11405: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_COLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11407: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_COLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11409: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_PCOLUMN_VIEW" has errors"
[2025-10-22 18:50:15] -> Error at line 11411: script rdbms/admin/dpload.sql
[2025-10-22 18:50:15] - ORA-04063: view "SYS.KU$_PCOLUMN_VIEW" has errors"
I checked DBA_OBJECTS, and all the SYS objects are VALID. I tried querying one of the views and it worked fine. So, I went to My Oracle Support, our Oracle database support site, and searched for ORA-04063 and one of the view names and found nothing. A Google search also came up empty. I tried just ignoring it but that didn’t work. My whole goal in doing this was to apply the October 2025 patches that just came out this week. But because the SQL patch registry indicated that patch 30763851 rolled back with errors, every time I applied a new patch it would try to roll 30763851 back first and error again. Here is what DBA_REGISTRY_SQLPATCH looked like after two failed rollback attempts:
INSTALL_ID PATCH_ID PATCH_TYPE ACTION STATUS
---------- ---------- ---------- --------------- --------------
1 30125133 RU APPLY SUCCESS
2 30763851 INTERIM APPLY SUCCESS
3 30763851 INTERIM ROLLBACK WITH ERRORS
3 30763851 INTERIM ROLLBACK WITH ERRORS
4 30763851 INTERIM ROLLBACK WITH ERRORS
4 30763851 INTERIM ROLLBACK WITH ERRORS
Each rollback attempt tried twice so I have four failures with two rollback attempts.
I opened a case with Oracle support just in case this was a known issue that wasn’t available for me to find on my own. Sometimes that happens. But while waiting on Oracle I kept trying to fix it myself.
The errors refer to $ORACLE_HOME/rdbms/admin/dpload.sql which I think reloads datapump after some change. It runs catmetviews.sql and catmetviews_mig.sql which have the CREATE VIEW statements for the views getting errors, like SYS.KU$_OPQTYPE_VIEW. But the code in catmetviews_mig.sql wasn’t straightforward. I imagined running some sort of trace to see why the script was throwing the ORA-04063 errors, but I never had to take it that far.
At first all this stressed me out. I thought, “I can’t back out this patch. I will never be able to patch this database to a current patch level.” Then I chilled out and realized that if it was a problem with Oracle’s code, they had to help me back out 30763851. But it might take some time to work through an SR with Oracle.
But what if it wasn’t an issue with Oracle’s code but something weird in our environment? I didn’t think it indicated a real problem, but there were some weird messages coming out that I am used to seeing. They were from triggers that come with an auditing tool called DB Protect. They were throwing messages like this:
[SYS.SENSOR_DDL_TRIGGER_A] Caught a standard exception: aliasId=100327, error=-29260, message="ORA-29260: network error: TNS:no listener"
We are used to seeing these errors when we do DDL but prior to this it didn’t cause any actual problems. We had already decommisioned the DB Protect tool but had not cleaned up the triggers. Dropping SYS.SENSOR_DDL_TRIGGER_A eliminated the ORA-04063 errors.
Probably no one will ever encounter this same issue, but I thought I would document it. If you have the same symptoms and you are not using DB Protect any more, do these commands:
DROP TRIGGER SYS.SENSOR_DDL_TRIGGER_A;
DROP TRIGGER SYS.SENSOR_DDL_TRIGGER_B;
I think the A trigger was the problem, but we don’t need either one.
Anyway, this post is just so someone who searches for ORA-04063 and one of the views will find this information and drop the triggers if they have them. It’s a long shot but might as well document it for posterity and for me.
Bobby
The "Retrieval" for RAG using Semantic Search
Reusing the sample code from my previous demo, I build a table with 130 sentences describing AI Tools and then use SQL to run the sort of queries that you'd expect RAG to use against an "internal knowledge source".
Here's a preview of the sentences loaded :
SQL> select count(*) from my_data_source;
COUNT(*)
----------
130
SQL> select my_sentence from my_data_source fetch first 5 rows only;
MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
Adobe Firefly: A family of generative AI models integrated into Adobe's Creative Cloud suite. It is used for professional image editing and graphic
design
AIVA (Artificial Intelligence Virtual Artist): An AI composer that generates music in various styles for content creators and brands
Amazon CodeWhisperer: An AI coding assistant from Amazon Web Services that provides contextual code recommendations
Canva Magic Media: An AI image generator integrated into the user friendly design platform. It is used for creating visuals for social media and pr
esentations
ChatGPT: An AI chatbot from OpenAI. It can engage in human like conversations, write code, summarize text, and create content such as stories and e
ssays
SQL> select my_sentence from my_data_source where id > 126 ;
MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
Zubtitle.com captions for video
Cohere.ai large language models
Grok.com personal AI assistant
Claude.ai advanced AI chatbot
SQL>
The embeddings are then computed and loaded into the target table using the same query (the HuggingFace model ALL_MINILM_L12_V2_AUGMENTED has already been loaded with the ONNX RunTime engine with the same code as in the previous blog post) :
insert into my_data_vectors
select ds.id as id, my_sentence, to_vector(et.embed_vector) sentence_vector
from
my_data_source ds,
dbms_vector_chain.utl_to_embeddings(
dbms_vector_chain.utl_to_chunks(dbms_vector_chain.utl_to_text(ds.my_sentence), json('{"normalize":"all"}')),
json('{"provider":"database", "model":"ALL_MINILM_L12_V2_AUGMENTED"}')) t,
JSON_TABLE(t.column_value, '$[*]' COLUMNS (embed_id NUMBER PATH '$.embed_id', embed_data VARCHAR2(4000) PATH '$.embed_data', embed_vector CLOB PATH '$.embed_vector')) et
/
I now test a few "Retrieval" queries thus : (Note how the queries return different results with smaller VECTORE_DISTANCE as I refine them)
SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : Video
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
2 :text_variable := '&text_input';
3 SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING :text_variable as data) into :query_vector;
4 END;
5 /
old 2: :text_variable := '&text_input';
new 2: :text_variable := 'Video';
PL/SQL procedure successfully completed.
SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
2 FROM my_data_vectors
3 ORDER BY 2
4 FETCH FIRST 3 ROWS ONLY;
MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
Clipchamp.com quick video creation
4.635E-001
Lumen5.com AI powered social video
5.277E-001
Synths.video convert blogs into videos
5.39E-001
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------
SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : What AI Tool can I use to edit Videos ?
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
2 :text_variable := '&text_input';
3 SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING :text_variable as data) into :query_vector;
4 END;
5 /
old 2: :text_variable := '&text_input';
new 2: :text_variable := 'What AI Tool can I use to edit Videos ?';
PL/SQL procedure successfully completed.
SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
2 FROM my_data_vectors
3 ORDER BY 2
4 FETCH FIRST 3 ROWS ONLY;
MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
Wisecut.video auto edit with AI
2.62E-001
Runwayml.com AI video editing
2.777E-001
Gling.ai YouTube video editor
3.413E-001
SQL>
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------
SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : image
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
2 :text_variable := '&text_input';
3 SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING :text_variable as data) into :query_vector;
4 END;
5 /
old 2: :text_variable := '&text_input';
new 2: :text_variable := 'image';
PL/SQL procedure successfully completed.
SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
2 FROM my_data_vectors
3 ORDER BY 2
4 FETCH FIRST 3 ROWS ONLY;
MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
VanceAI.com image enhancement
5.268E-001
Canva Magic Media: An AI image generator integrated into the user friendly design platform. It is used for creating visuals for social media and pr
esentations
5.459E-001
PicWish.com photo editing
5.696E-001
SQL>
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------
SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : How can I use AI to edit Images ?
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
2 :text_variable := '&text_input';
3 SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING :text_variable as data) into :query_vector;
4 END;
5 /
old 2: :text_variable := '&text_input';
new 2: :text_variable := 'How can I use AI to edit Images ?';
PL/SQL procedure successfully completed.
SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
2 FROM my_data_vectors
3 ORDER BY 2
4 FETCH FIRST 3 ROWS ONLY;
MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
Hotpot.ai AI image editing
2.805E-001
Runwayml.com AI video editing
3.711E-001
Wisecut.video auto edit with AI
3.921E-001
SQL>
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------
SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : Can I chat with an AI ?
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
2 :text_variable := '&text_input';
3 SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING :text_variable as data) into :query_vector;
4 END;
5 /
old 2: :text_variable := '&text_input';
new 2: :text_variable := 'Can I chat with an AI ?';
PL/SQL procedure successfully completed.
SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
2 FROM my_data_vectors
3 ORDER BY 2
4 FETCH FIRST 3 ROWS ONLY;
MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
Claude.ai advanced AI chatbot
2.989E-001
Claude.ai conversational assistant
3.828E-001
Grok.com personal AI assistant
4.025E-001
SQL>
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------
SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : coding
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
2 :text_variable := '&text_input';
3 SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING :text_variable as data) into :query_vector;
4 END;
5 /
old 2: :text_variable := '&text_input';
new 2: :text_variable := 'coding';
PL/SQL procedure successfully completed.
SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
2 FROM my_data_vectors
3 ORDER BY 2
4 FETCH FIRST 3 ROWS ONLY;
MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
Amazon CodeWhisperer: An AI coding assistant from Amazon Web Services that provides contextual code recommendations
5.543E-001
Replit.com write and run code
5.548E-001
GitHub Copilot: An AI coding assistant that provides code suggestions and autocompletion within an IDE
5.677E-001
SQL>
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------
SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : Tell me of a Coding Assistant
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
2 :text_variable := '&text_input';
3 SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING :text_variable as data) into :query_vector;
4 END;
5 /
old 2: :text_variable := '&text_input';
new 2: :text_variable := 'Tell me of a Coding Assistant';
PL/SQL procedure successfully completed.
SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
2 FROM my_data_vectors
3 ORDER BY 2
4 FETCH FIRST 3 ROWS ONLY;
MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
GitHub Copilot: An AI coding assistant that provides code suggestions and autocompletion within an IDE
4.306E-001
Amazon CodeWhisperer: An AI coding assistant from Amazon Web Services that provides contextual code recommendations
4.702E-001
Replit: An online IDE with an integrated AI assistant that helps with code generation, debugging, and project creation
4.734E-001
SQL>
SQL>
Reflections from Oracle AI World 2025: Innovation, Community, and the Road Ahead
Discover the major announcements from Oracle AI World 2025, including the transformation to Oracle Database 26ai, GoldenGate's AI-ready capabilities, and the game-changing OCI GoldenGate on Azure integration. Learn how Oracle is architecting AI into the core of data management and why these innovations position organizations for the AI revolution.
The post Reflections from Oracle AI World 2025: Innovation, Community, and the Road Ahead appeared first on DBASolved.
Oracle APEX, OCI, Ollama at work thanks to OllAPEX for the #JoelKallmanDay
Today is the day dedicated to the late Joel Kallman, a mentor and friend, so I thought it would be nice to show off my APEX application hosted on OCI working with LLMs run via the Ollama server located at home.
It's an interesting exercise because one learns a lot about how these "things" work, the strong and weak points of each model, which one works best for a certain task, which one has the best "vision" capability, which one creates the best embeddings.
So, here are a few short videos showing some "vibe" coding that I'll need to revise tomorrow, the analysis of two versions of the same procedure created by the same LLM, then a "live" comparison showing how other models answer the same question differently.
So far, so good.
#JoelKallmanDay #orclAPEX, #AI
AI / LLM : Semantic Similarity Search using Hugging Face with Vectors in an Oracle 23ai Database
As I have recently gone through a course on LLMs for Chatbots and RAG on OCI (Oracle Cloud Infrastructure), I thought about putting together a quick demo that uses
A The Hugging Face all-MiniLM-L12-v2 model (with some augmentations done by Oracle for download)
B The ONNX (Open Neural Network Exchange) Runtime Engine that runs in Oracle 23ai (and, yes, in the 23ai Free downloadable edition, not requiring the Oracle Cloud)
C Vectors stored in an Oracle Database using the Vector Datatype
D Use the Hugging Face model to generate Vectors for Similarity / Semantic Searches
but DOES NOT USE Oracle Cloud Infrastructure. (To use OCI for Embedding -- generating "Vectors" from "Tokens", I have to book a Dedicated AI Cluster to "Fine Tune" an existing model, so I am currently staying away from this expensive method)
[Similarly, I cannot currently build a Generative AI as I have to setup a model and server resources]
The sequence is :
1. Setup an Oracle 23ai database (either in the Oracle Cloud OR on an Exadata machine OR the Free edition for On-Premises testing)
2. Create a User to hold the source data and vectors
3. Import the augmented Hugging Face model
4. Load the Source Data as words or sentences or chunks -- here I use a few sentences
5. Compute Vectors for the target column
6. Optional but recommended : Build Vector Index -- for this demo with very few rows I haven't built a Vector Index but if and/or when I do build a demo with a few hundred or thousand sentences or chunks of data, I will create an Index on the Vectors
7. Use Oracle's VECTOR_DISTANCE Function for a Semantic Similarity Search
So here goes :
Using the 23ai FREEPDB downloadable VM, I created the user with these grants : grant connect to vector_demo identified by password ; alter user vector_demo default tablespace users ; alter user vector_demo quota unlimited on users ; grant DB_DEVELOPER_ROLE to vector_demo ; grant CREATE MINING MODEL to vector_demo; CREATE OR REPLACE DIRECTORY DM_DUMP as '/tmp/Load_Model'; -- on the Linux VM GRANT READ ON DIRECTORY dm_dump TO vector_demo; GRANT WRITE ON DIRECTORY dm_dump TO vector_demo;
Download and Import the Model :
Download the ZIP file from https://adwc4pm.objectstorage.us-ashburn-1.oci.customer-oci.com/p/VBRD9P8ZFWkKvnfhrWxkpPe8K03-JIoM5h_8EJyJcpE80c108fuUjg7R5L5O7mMZ/n/adwc4pm/b/OML-Resources/o/all_MiniLM_L12_v2_augmented.zip It also contains a README Place the extracted ONNX file on the Linux Server at /tmp/Load_Model SQL> !ls -l /tmp/Load_Model total 249932 -rw-r--r--. 1 oracle oinstall 122597346 Jul 15 2024 all_MiniLM_L12_v2_augmented.zip -rw-r--r--. 1 oracle oinstall 133322334 Jul 15 2024 all_MiniLM_L12_v2.onnx -rw-r--r--. 1 oracle oinstall 4232 Jul 15 2024 README-ALL_MINILM_L12_V2-augmented.txt SQL> If necessary, delete the previously imported model from the database (logged in to the SQL command line as vector_demo) : exec DBMS_VECTOR.DROP_ONNX_MODEL(model_name => 'ALL_MINILM_L12_V2_AUGMENTED', force => true); Import the model file : SQL> EXECUTE DBMS_VECTOR.LOAD_ONNX_MODEL(- directory=>'DM_DUMP',- file_name=>'all_MiniLM_L12_v2.onnx',- model_name=>'ALL_MINILM_L12_V2_AUGMENTED'- metadata => JSON('{"function" : "embedding", "embeddingOutput" : "embedding", "input": {"input": ["DATA"]}}')- ); The "model_name" is the User Defined Name (i.e. defined by the DBA). In many online demos and the online demo, I see the model name specified as "doc_model". I preferred to use the name "ALL_MINILM_L12_V2_AUGMENTED". You could use any meaningful name Verify the imported model with these queries : SQL> l 1 SELECT model_name, attribute_name, attribute_type, data_type, vector_info 2 FROM user_mining_model_attributes 3 WHERE model_name = 'ALL_MINILM_L12_V2_AUGMENTED' 4* ORDER BY ATTRIBUTE_NAME SQL> / MODEL_NAME ATTRIBUTE_NAME ATTRIBUTE_TY DATA_TYPE -------------------------------- ---------------- ------------ ---------------- VECTOR_INFO -------------------------------------------------------- ALL_MINILM_L12_V2_AUGMENTED DATA TEXT VARCHAR2 ALL_MINILM_L12_V2_AUGMENTED ORA$ONNXTARGET VECTOR VECTOR VECTOR(384,FLOAT32) SQL> SQL> l 1 SELECT MODEL_NAME, MINING_FUNCTION, ALGORITHM, 2 ALGORITHM_TYPE, MODEL_SIZE 3 FROM user_mining_models 4 WHERE model_name = 'ALL_MINILM_L12_V2_AUGMENTED' 5* ORDER BY MODEL_NAME SQL> / MODEL_NAME MINING_FUNCTION ALGORITHM ALGORITHM_ MODEL_SIZE -------------------------------- ------------------------------ ------------------------------ ---------- ---------- ALL_MINILM_L12_V2_AUGMENTED EMBEDDING ONNX NATIVE 133322334 SQL> SQL>SELECT * FROM DM$VMALL_MINILM_L12_V2_AUGMENTED ORDER BY NAME ; NAME VALUE ------------------ ------------------------------------------------ Graph Description Graph combining tokenizer and main_graph tokenizer main_graph Graph Name tokenizer_main_graph Input[0] input:string[?] Output[0] embedding:float32[?,384] Producer Name onnx.compose.merge_models Version 1 6 rows selected. SQL> SQL> SELECT * FROM DM$VPALL_MINILM_L12_V2_AUGMENTED ORDER BY NAME; NAME VALUE ------------------ ------------------------------------------------ embeddingOutput embedding function embedding SQL> SQL> SELECT * FROM DM$VJALL_MINILM_L12_V2_AUGMENTED; METADATA -------------------------------------------------------------------------------- {"function":"embedding","embeddingOutput":"embedding","input":{"input":["DATA"]} SQL>
SQL> SELECT VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'The quick brown fox jumped' as DATA) AS embedding;
EMBEDDING
------------------------------------------------------------------------------------------------------------------------------------
[1.65517051E-002,3.19098569E-002,-1.96293015E-002,-3.56926955E-002,
SQL>
SQL> SELECT VECTOR_DIMS(VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'The quick brown fox jumped' as DATA)) as Dimensions_Count;
DIMENSIONS_COUNT
----------------
384
SQL>
note : The output is actually a list of 384 dimensions, I copied the whole output from SQL Developer put it into a Python List and
then did a LEN(list) to count the number of elements
SQL> drop table my_data_source purge;
Table dropped.
SQL> create table my_data_source(id number primary key, my_sentence varchar2(4000));
Table created.
SQL> insert into my_data_source values (1,'The quick brown fox jumped over the lazy dog');
1 row created.
SQL> insert into my_data_source values (2,'she sells sea-shells at the seashore');
1 row created.
SQL> insert into my_data_source values (3,'the fox and dog are brown friends');
1 row created.
SQL> insert into my_data_source values (4,'the elephant knows the fox and dog');
1 row created.
SQL> insert into my_data_source values (5,'the fox, dog and elephant live together');
1 row created.
SQL> insert into my_data_source values (6,'aeroplanes fly in the sky');
1 row created.
SQL> commit;
Commit complete.
SQL>
SQL> select * from my_data_source;
ID
----------
MY_SENTENCE
------------------------------------------------------------------------------------------------------------------------------------
1
The quick brown fox jumped over the lazy dog
2
she sells sea-shells at the seashore
3
the fox and dog are brown friends
4
the elephant knows the fox and dog
5
the fox, dog and elephant live together
6
aeroplanes fly in the sky
6 rows selected.
SQL>
Create the Table to hold the computed Vectors, using the model that has been loaded
SQL> drop table my_data_vectors;
Table dropped.
SQL> create table my_data_vectors (id number primary key, my_sentence varchar2(4000), sentence_vector vector);
Table created.
SQL>
SQL> insert into my_data_vectors
select ds.id as id, my_sentence, to_vector(et.embed_vector) sentence_vector
from
my_data_source ds,
dbms_vector_chain.utl_to_embeddings(
dbms_vector_chain.utl_to_chunks(dbms_vector_chain.utl_to_text(ds.my_sentence), json('{"normalize":"all"}')),
json('{"provider":"database", "model":"ALL_MINILM_L12_V2_AUGMENTED"}')) t,
JSON_TABLE(t.column_value, '$[*]' COLUMNS (embed_id NUMBER PATH '$.embed_id', embed_data VARCHAR2(4000) PATH '$.embed_data', embed_vector CLOB PATH '$.embed_vector')) et;
2 3 4 5 6 7 8
6 rows created.
SQL>
SQL> commit
2 /
Commit complete.
SQL> select * from my_data_vectors;
ID
----------
MY_SENTENCE
------------------------------------------------------------------------------------------------------------------------------------
SENTENCE_VECTOR
------------------------------------------------------------------------------------------------------------------------------------
1
The quick brown fox jumped over the lazy dog
[1.70537992E-003,5.95331714E-002,-2.32173726E-002,3.02353837E-002,
2
she sells sea-shells at the seashore
[1.2790652E-002,5.97022101E-002,8.24511051E-002,3.93822305E-002,
3
the fox and dog are brown friends
[7.67209902E-002,-7.82399923E-002,-2.40087509E-002,-2.37390138E-002,
4
the elephant knows the fox and dog
[4.04452085E-002,-4.15055361E-003,-1.71641614E-002,2.63017584E-002,
5
the fox, dog and elephant live together
[9.44276601E-002,-4.74944711E-002,2.31287945E-002,2.9055763E-002,
6
aeroplanes fly in the sky
[1.1147093E-001,2.18752325E-002,-4.58196662E-002,-2.64751501E-002,
6 rows selected.
SQL>
Do a Semantic Similarity Search for certain phrases :
SQL> l
1 SELECT VECTOR_DISTANCE(
2 VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'The quick brown fox jumped' as DATA),
3 VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'The quicker yellow fox jumped' as DATA),
4* COSINE) as Vector_Distance
SQL> /
VECTOR_DISTANCE
---------------
1.76127E-001
SQL>
SQL> l
1 SELECT VECTOR_DISTANCE(
2 VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'The quick brown fox jumped' as DATA),
3 VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'seashells' as DATA),
4* COSINE) as Vector_Distance
SQL> /
VECTOR_DISTANCE
---------------
9.81778E-001
SQL>
SQL> l
1 SELECT VECTOR_DISTANCE(
2 VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'The quick brown fox jumped' as DATA),
3 VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'moon elephants' as DATA),
4* COSINE)as Vector_Distance
SQL> /
VECTOR_DISTANCE
---------------
7.51297E-001
SQL>
--- "the quicker yellow fox jumped" is not very far away from "the quick brown fox jumped"
--- "moon elephants" is far away from the "the quick brown fox jumped".
--- "seashells" is much further away.
--- likely because the prebuilt model recognises tnat "fox" and "elephant" have some similarity being animals
SQL> l
1 SELECT VECTOR_DISTANCE(
2 VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'The quick brown fox jumped' as DATA),
3 VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'houshold electricals' as DATA),
4* COSINE)as Vector_Distance
SQL> /
VECTOR_DISTANCE
---------------
9.97313E-001
SQL>
-- "household" electricals is further away
SQL> l
1 SELECT id , my_sentence, vector_distance(sentence_vector , VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'sea-shells' as DATA), COSINE) as Vector_Distance
2 FROM my_data_vectors
3 ORDER BY vector_distance(sentence_vector , VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'sea-shells' as DATA), COSINE)
4* FETCH FIRST 2 ROWS ONLY
SQL> /
ID MY_SENTENCE VECTOR_DISTANCE
------------ ------------------------------------------------ ---------------
2 she sells sea-shells at the seashore 3.65937E-001
6 aeroplanes fly in the sky 8.43676E-001
SQL>
SQL> l
1 SELECT id , my_sentence, vector_distance(sentence_vector , VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'she sells seashells' as DATA), COSINE) as Vector_Distance
2 FROM my_data_vectors
3 ORDER BY vector_distance(sentence_vector , VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'she sells seashells' as DATA), COSINE)
4* FETCH FIRST 2 ROWS ONLY
SQL> /
ID MY_SENTENCE VECTOR_DISTANCE
------------ ------------------------------------------------ ---------------
2 she sells sea-shells at the seashore 1.46406E-001
6 aeroplanes fly in the sky 9.26212E-001
SQL>
--- If I refine my query from "sea-shells" to "she sells seashells", I can get a better match, with the Vector Distance dropping from 3.6*10^-1 to 1.4*10^-1
SQL> l
1 SELECT id , my_sentence, vector_distance(sentence_vector , VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'quick fox' as DATA), COSINE) as Vector_Distance
2 FROM my_data_vectors
3 ORDER BY vector_distance(sentence_vector , VECTOR_EMBEDDING(ALL_MINILM_L12_V2_AUGMENTED USING 'quick fox' as DATA), COSINE)
4* FETCH FIRST 2 ROWS ONLY
SQL> /
ID MY_SENTENCE VECTOR_DISTANCE
------------ ------------------------------------------------ ---------------
1 The quick brown fox jumped over the lazy dog 3.22912E-001
4 the elephant knows the fox and dog 5.21092E-001
SQL>
--- "quick fox" has a better similarity with sentence 1 than sentence 4
What LLMs do when "generating" the next word (known as "token") is to find the "token" that has a higher probability -- a lower COSINE Distance


