Signing up for an Autonomous Database is easy. You can have an instance up and running in just a few minutes. And now, you can even have one for FREE.

But one of the first things you’re going to want to do is shove some TABLEs into your schema, or just load some data.

We’re working on making this even easier, but let’s quickly recap what you can already do with our tools.

A Saucerful of “Secrets” Ways to Load Data to the Cloud with our DB Tools

Taking Advantage of AUTO TABLE and ORDS

If you already have a TABLE in your schema, and you want to create a REST API for accessing said table, we make that easy.

It’s a right-click in SQL Developer (Desktop)

This UI is coming to SQL Developer Web, soon.

Or you could of course just run this very simple PL/SQL block –

BEGIN
    ORDS.ENABLE_OBJECT(p_enabled => TRUE,
                       p_schema => 'JEFF',
                       p_object => 'HOCKEY_STATS',
                       p_object_type => 'TABLE',
                       p_object_alias => 'hockey_stats',
                       p_auto_rest_auth => TRUE);
    COMMIT;
END;
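
If you want a quick sanity check that the enable actually 'took', ORDS ships metadata views you can query right from your schema (they're already there in Autonomous). A minimal example:

-- list everything in my schema that is currently REST enabled
SELECT *
  FROM user_ords_enabled_objects;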

Another quick aside: if you need to catch up on these topics, I've talked about creating your application SCHEMA and REST enabling it for SQL Developer Web access.

And, I’ve talked about using the CSV Load feature available with the ORDS AUTO Table mechanism.
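
For reference, the schema-level REST enable is also just one PL/SQL call. Here's a minimal sketch, run as the ADMIN user in Autonomous, assuming the 'tjs' URL alias I use later in this post:

BEGIN
    -- REST enable the JEFF schema and map it to the 'tjs' alias used in the URLs below
    ORDS_ADMIN.ENABLE_SCHEMA(p_enabled => TRUE,
                             p_schema => 'JEFF',
                             p_url_mapping_type => 'BASE_PATH',
                             p_url_mapping_pattern => 'tjs',
                             p_auto_rest_auth => TRUE);
    COMMIT;
END;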

My TABLE

I have a HOCKEY_STATS table that I want to load from some CSV data I have on my PC. It's an 8 MB file (35,000 rows, 70 columns).

Now, I could use the Import from CSV feature in SQL Developer (Desktop) to populate the TABLE…

It took approximately 16 seconds to batch load 35,000 records to my Autonomous Database service running in our Ashburn Data Center from Cary, NC – using the MEDIUM service.

That’s not super quick, but it was super easy.

But what if I need a process that can be automated? And my API du jour is HTTP and REST?

Let’s POST up my CSV to the TABLE API

Let’s find the URI first. Go into your Development Console page for your service – you’ll see we show you what all of your ORDS API calls will start with:

You don’t have to guess or reverse engineer your ORDS calls from the SQL Developer Web or APEX links anymore.

After the ‘/ords/’ I’m going to include my REST Enabled SCHEMA alias, which I have specified as ‘tjs’ in place of ‘JEFF’, and then my TABLE alias, which I’ve just left as ‘hockey_stats’.

So if I want to do a CSV load, I need to HTTPS POST to

https://ABCDEFGHIJK0l-somethingash.adb.us-ashburn-1.oraclecloudapps.com/ords/tjs/hockey_stats/batchload?batchRows=1000

The ‘/batchload?batchRows=1000’ at the end tells ORDS what we’re doing with the TABLE, and how to do it. This is documented here – and you’ll see there are quite a few options you can tweak.

Before I can exercise the API, I need to assign the ORDS Privilege for the TABLE API to the ‘SQL Developer’ ORDS Role. That will let me authenticate and authorize via my ‘JEFF’ Oracle database user account.

There’s a PL/SQL API for this as well as an interface in APEX.
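
For example, here's one way to do it with the PL/SQL API – a sketch that defines a privilege over the table's URL pattern and attaches the 'SQL Developer' role to it (the privilege name and label here are just placeholders I picked):

DECLARE
    l_roles    owa.vc_arr;
    l_patterns owa.vc_arr;
BEGIN
    l_roles(1)    := 'SQL Developer';
    l_patterns(1) := '/hockey_stats/*';

    -- protect everything under /ords/tjs/hockey_stats/ and require the 'SQL Developer' role
    ORDS.DEFINE_PRIVILEGE(p_privilege_name => 'jeff.hockey_stats.priv',
                          p_roles          => l_roles,
                          p_patterns       => l_patterns,
                          p_label          => 'Hockey Stats Table API',
                          p_description    => 'Protects the HOCKEY_STATS AutoREST endpoints');
    COMMIT;
END;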

If that sounds ‘icky’, you can also take advantage of our built-in OAuth2 client (example).

Now, let’s make our call. I’m going to use a REST Client (Insomnia) but I could easily just use cURL.

Almost 10 seconds…not blazing fast, but again, very easy (no code!) and it’s sending 8MB over HTTP….

I could tweak the batchRows parameter and see if I could get faster loads – I’m sure I could. But the whims of public internet latency and the nature of the data I’m sending up in 16 KB chunks make this a fun ‘it depends’ tech scenario.

30 September 2020 Update – 5M Rows, ~27 sec!

I decided to throw a slightly larger scenario at ORDS in Autonomous.

133 MB of CSV – 5,000,000 rows including the 1 header row

And our data…

Live in Autonomous ~25 seconds after hitting POST in my local REST client

Note this scenario was also from Cary, NC to our Cloud Data Center in Virginia…and of course your results may vary based on current network loads.

I also increased the batchRows setting from 1,000 up to 5,000. I tried 7,500 and 10,000 as well, but didn’t see any additional performance improvement – though again, this is NOT a scientific test.

Some more notes on settings (Docs).

batchRows – the number of rows to include in each batch – so we’re inserting/committing 5,000 rows at a time, for a total of roughly 1,000 batches.

We could include additional settings…

errors and errorsMax in particular. You may want to set errorsMax to a reasonable number – in other words, decide whether you want to give up on the load if, say, more than 10% of the rows fail to insert. For debugging, I’d suggest setting it to 1 or maybe 10, so your test ‘fails fast’ and doesn’t consume unnecessary server resources.

And of course, results may vary, but I got this to run as ‘fast’ as 25 seconds and as slow as 29 seconds.

Tried one more time before hitting the ‘Update/Publish’ button 🙂

I ran this scenario 10 times each, and also for 10M and 20M rows of this same data. I found that running the loads in 5M-row batches outperformed the larger ones. That is, I could kick off two 5M-row load requests and get them done MUCH faster than a single 10M-row load.

For example, running 10M rows, 10 times, my average load time was 90 seconds, vs 30 seconds for the 5M loads.

When I kicked off two 5M runs concurrently, both were finished in less than 27.4 seconds!

Mostly scientific, mostly.

The cURL

I hate cURL. It took me more than a few minutes, and I ended up having to Zoom with @krisrice to finally get this JUST right…

curl --write-out '%{time_total}' -X POST --data-binary "@5M.csv" -H "Content-Type:text/csv" --user <user>:<password> "https://....adb.us-ashburn-1.oraclecloudapps.com/ords/<user>/huge_csv/batchload?batchRows=5000&errorsMax=20"

Don’t forget the Content-Type header! I was leaving it off and weird things were happening. I also had to hop over to Stack Overflow to figure out how to stream in the contents of the file, and then I learned a cool trick for printing the total HTTP response time with the call. I’m sharing this for ME, because I’ll forget it tomorrow.
