Oracle SQLcl version 26.1 came out last week.
From the release notes, we have this nugget –
Schema-information tool now supports filtering and ability to retrieve details for other schemas
So what does this mean?
I’ve used my handy agent to summarize this tool for me, but the TL;DR is simple: you can now ask our MCP Server to deliver information about ANY schema in the database (any schema you have access to, vs only the connected schema), and you can also filter which metadata you want and choose the level of detail you get back.
My ask 🤖 : Give me an overview of our schema-information tool – I want to use it to get specific data from my oracle database to assist with upcoming NL2SQL tasks here.
Agent output ⬇️
Here’s a practical breakdown of the schema-information tool and how to use it effectively for NL2SQL prep:
sqlcl:schema-information — Overview
This tool introspects an Oracle schema and returns structured metadata — essentially giving an LLM a map of your database so it can write accurate SQL.
Parameters
| Parameter | Default | Notes |
|---|---|---|
| schema | Current connected schema | Specify any schema you have access to, e.g. HR, SH, OE |
| level | BRIEF | BRIEF = name + type only; DETAILED = full column/constraint/index info |
| objects_list | All objects | Comma-separated list to narrow to specific tables/views/etc. |
| filter | % | SQL LIKE pattern on object names, e.g. EMPLOYEE% |
| executionType | SYNCHRONOUS | Use ASYNCHRONOUS for large schemas that take >2s |
| model | UNKNOWN-LLM | Pass the model name for logging/telemetry purposes |
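To make the parameters concrete, here’s a hypothetical sketch of how a client might assemble the argument payload for a tool call. The helper function and its names are my own for illustration; only the parameter names come from the table above, and the actual wire format depends entirely on your MCP client.

```python
# Hypothetical sketch: build the argument payload for a
# schema-information tool call. Parameter names match the table above;
# the function itself is illustrative, not part of SQLcl.
def build_schema_info_args(schema, level="BRIEF", objects_list=None,
                           name_filter="%", execution_type="SYNCHRONOUS"):
    """Assemble the arguments dict for a schema-information tool call."""
    args = {
        "schema": schema,              # any schema you have access to
        "level": level,                # BRIEF or DETAILED
        "filter": name_filter,         # SQL LIKE pattern on object names
        "executionType": execution_type,
    }
    if objects_list:                   # narrow to specific objects
        args["objects_list"] = ",".join(objects_list)
    return args

# A "deep context" style call for three HR tables:
args = build_schema_info_args(
    "HR", level="DETAILED",
    objects_list=["EMPLOYEES", "DEPARTMENTS", "JOBS"],
)
```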
Two Key Modes for NL2SQL
Quick inventory — Use BRIEF to see what objects exist in a schema before deciding what to pull in detail:
schema: HR, level: BRIEF
Deep context for NL2SQL — Use DETAILED with a targeted objects_list to get column names, data types, constraints, and relationships for the tables the LLM will be writing SQL against:
schema: HR, level: DETAILED, objects_list: EMPLOYEES,DEPARTMENTS,JOBS
Practical Strategy for NL2SQL
Rather than dumping your entire schema into a prompt (expensive and noisy), a good pattern is:
- Run BRIEF on the whole schema to inventory everything
- Identify the relevant tables for your NL2SQL domain
- Run DETAILED on just those tables
- Paste that output as the schema context block in your NL2SQL prompt
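The last step above can be sketched in a few lines. This is a hypothetical helper of my own; the metadata string would be whatever the DETAILED tool call returned for your chosen tables.

```python
# Sketch: wrap targeted schema metadata plus a user question into a
# single NL2SQL prompt. The function and prompt layout are illustrative
# assumptions, not a SQLcl or MCP API.
def build_nl2sql_prompt(question, schema, tables, detailed_metadata):
    """Combine DETAILED schema output and a question into one prompt."""
    context = (
        f"-- Schema: {schema}\n"
        f"-- Tables in scope: {', '.join(tables)}\n"
        f"{detailed_metadata}"
    )
    return (
        "You are writing Oracle SQL. Use ONLY the objects described below.\n\n"
        f"### Schema context\n{context}\n\n"
        f"### Question\n{question}\n"
    )

prompt = build_nl2sql_prompt(
    "find the top 5 customers and what they were buying",
    "SH",
    ["SALES", "CUSTOMERS", "PRODUCTS"],
    "SALES(PROD_ID, CUST_ID, TIME_ID, QUANTITY_SOLD, AMOUNT_SOLD) ...",
)
```

The point of the targeted objects_list is exactly this: the context block stays small enough to paste into every request without drowning the model in irrelevant tables.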
Trying it for real, my prompt:
My ask 🤖 : connect to my Raptor database, use the information tool to get a simple overview of the SH schema.
I want an overview of sales orders, their products, and the customers involved. Use the mcp tool to get what you need for any NL2SQL requests in this chat vs running dictionary queries on your own please.
And the agent proceeds to execute on its plan, including the schema-information tool.

And the Agent response ⬇️
Here’s a clean overview of the SH schema, focused on your sales/products/customers domain:
SH Schema — NL2SQL Context
Core Tables (your focus area)
SALES — the fact table, partitioned (15 partitions) PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID, QUANTITY_SOLD, AMOUNT_SOLD → Joins to every dimension table. No surrogate PK — grain is the combination of all FK columns.
CUSTOMERS — rich dimension CUST_ID, CUST_FIRST_NAME, CUST_LAST_NAME, CUST_GENDER, CUST_YEAR_OF_BIRTH, CUST_MARITAL_STATUS, CUST_CITY, CUST_STATE_PROVINCE, COUNTRY_ID, CUST_INCOME_LEVEL, CUST_CREDIT_LIMIT, CUST_EMAIL → FK to COUNTRIES
PRODUCTS — hierarchical product dimension PROD_ID, PROD_NAME, PROD_DESC, PROD_SUBCATEGORY, PROD_CATEGORY, PROD_LIST_PRICE, PROD_MIN_PRICE, PROD_STATUS, SUPPLIER_ID → Has SCD columns (PROD_EFF_FROM, PROD_EFF_TO, PROD_VALID)
TIMES — very rich time dimension TIME_ID (DATE), DAY_NAME, CALENDAR_MONTH_DESC, CALENDAR_QUARTER_DESC, CALENDAR_YEAR, FISCAL_YEAR, FISCAL_QUARTER_DESC → Both calendar and fiscal hierarchies
CHANNELS — small lookup CHANNEL_ID, CHANNEL_DESC, CHANNEL_CLASS
COUNTRIES — geographic rollup COUNTRY_ID, COUNTRY_NAME, COUNTRY_SUBREGION, COUNTRY_REGION
Key Relationships for NL2SQL
SALES ──┬── CUSTOMERS (CUST_ID) → who bought
├── PRODUCTS (PROD_ID) → what they bought
├── TIMES (TIME_ID) → when they bought
├── CHANNELS (CHANNEL_ID) → how they bought
└── PROMOTIONS (PROMO_ID) → any promo involved
CUSTOMERS ── COUNTRIES (COUNTRY_ID) → where they are from
Also Available (supporting context)
- COSTS — partitioned cost fact table (unit cost, unit price by product/time/channel)
- PROMOTIONS — promo metadata with date ranges and category hierarchy
- SUPPLEMENTARY_DEMOGRAPHICS — extended customer attributes (education, occupation, household size, sports affinity)
- PROFITS (VIEW) — pre-built view likely combining sales and costs
- CAL_MONTH_SALES_MV / FWEEK_PSCAT_SALES_MV — materialized views for fast aggregation
Ready for NL2SQL queries! If you want, I can pull DETAILED metadata on any of these tables (especially SALES, CUSTOMERS, or PRODUCTS) before we start — that’ll give exact data types, constraints, and indexes for the LLM context.
What just happened
- I’m using our MCP Server to work with my Oracle AI Database…
- with a connection tied to database user, HR
- to get information about a schema, SH
- without running ANY data dictionary queries
A follow-up question –
My ask 🤖 : find the top 5 customers and what they were buying
My agent wants more info, so it’s going to call the information tool again, but this time with more details and only for 3 specific objects – in this case, tables.

Once the metadata was returned, my Agent was able to proceed with the NL2SQL portion of the task, and execute said SQL.
Again, the agent response ⬇️
Thinking about identifying top 5 customers and their purchasing patterns
First, let me grab the detailed metadata for the three tables we need:
Architected SQL query for customer spending analysis
Schema context confirmed. Now running the query — top 5 customers by total spend, with their most-purchased product categories:
Run-sql
WITH top_customers AS (
SELECT
c.CUST_ID,
c.CUST_FIRST_NAME || ' ' || c.CUST_LAST_NAME AS customer_name,
c.CUST_CITY,
c.CUST_STATE_PROVINCE,
SUM(s.AMOUNT_SOLD) AS total_spent,
SUM(s.QUANTITY_SOLD) AS total_items,
COUNT(*) AS total_transactions
FROM SH.SALES s
JOIN SH.CUSTOMERS c ON s.CUST_ID = c.CUST_ID
GROUP BY c.CUST_ID, c.CUST_FIRST_NAME, c.CUST_LAST_NAME, c.CUST_CITY, c.CUST_STATE_PROVINCE
ORDER BY total_spent DESC
FETCH FIRST 5 ROWS ONLY
),
top_category AS (
SELECT
s.CUST_ID,
p.PROD_CATEGORY,
SUM(s.AMOUNT_SOLD) AS cat_spent,
RANK() OVER (PARTITION BY s.CUST_ID ORDER BY SUM(s.AMOUNT_SOLD) DESC) AS rn
FROM SH.SALES s
JOIN SH.PRODUCTS p ON s.PROD_ID = p.PROD_ID
WHERE s.CUST_ID IN (SELECT CUST_ID FROM top_customers)
GROUP BY s.CUST_ID, p.PROD_CATEGORY
)
SELECT
tc.customer_name,
tc.cust_city || ', ' || tc.cust_state_province AS location,
TO_CHAR(tc.total_spent, '$999,999.99') AS total_spent,
tc.total_transactions,
tc.total_items,
cat.PROD_CATEGORY AS top_category
FROM top_customers tc
JOIN top_category cat ON tc.CUST_ID = cat.CUST_ID AND cat.rn = 1
ORDER BY tc.total_spent DESC;

Here are your top 5 customers by lifetime spend:
| Customer | Location | Total Spent | Transactions | Top Category |
|---|---|---|---|---|
| Dora Rice | Yokohama, Kanagawa | $103,412.66 | 248 | Golf |
| Lolita Katz | Oran, MO | $99,578.09 | 256 | Golf |
| Tesia Eppling | Solingen, NRW | $98,585.96 | 266 | Golf |
| Xerxes Abbassi | Wakefield, England | $98,006.16 | 235 | Golf |
| Rose Lehman | Yokohama, Kanagawa | $97,573.55 | 240 | Golf |
A few things jump out:
- Golf dominates — every top spender’s #1 category is Golf, which is worth digging into. That’s likely a high-ticket product category in this dataset.
- Quantity sold = transaction count for all of them, meaning each transaction is a single unit — no bulk buying at the top end.
- Internationally spread — Japan, Germany, UK, and US all represented. The SH schema’s global customer base is showing up.
- The spend range is pretty tight ($97K–$103K), suggesting these are consistently high-frequency buyers rather than one-time big spenders.
But Jeff, I have a HUGE, gnarly dictionary. It takes a while to get these schema information reports back!
MCP tool requests will time out after 60 seconds, so it’s important to support long-running requests asynchronously.
Simply ask your agent to plan for a long-running task; it will know to switch to the async task mode (executionType set to ASYNCHRONOUS) and will then ask for task updates until the results are ready.
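The async flow is essentially submit-then-poll. Your agent handles this for you, but here’s a generic sketch of the shape of that loop; the check_status callable is a stand-in for whatever task-update mechanism your MCP client exposes, and nothing here is a real SQLcl API.

```python
import time

# Generic submit-then-poll loop for a long-running tool call.
# `check_status` is injected so the loop's shape is the point,
# not any particular MCP client's API (which this is NOT).
def wait_for_task(check_status, poll_seconds=5, timeout_seconds=300):
    """Poll until the task reports done, or give up after timeout_seconds."""
    waited = 0
    while waited < timeout_seconds:
        status = check_status()        # e.g. {"done": True, "result": ...}
        if status.get("done"):
            return status.get("result")
        time.sleep(poll_seconds)       # back off between status checks
        waited += poll_seconds
    raise TimeoutError("schema-information task did not finish in time")
```

Because the metadata harvest runs server-side as a background task, the individual status checks each come back well under the 60-second MCP limit, no matter how gnarly the dictionary is.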