Microsoft has a lot to be proud of. They helped pioneer desktop computing. They have very successful platforms like SQL Server and Sharepoint. They are even working on a new mobile phone that doesn’t support copy and paste. But, I would argue their most popular product in use today is Excel.

Looks nice, but think twice before you do it!

Mayhaps ‘popular’ isn’t the best word here. Most reasonable people will probably agree that Excel is the predominant number crunching tool in the marketplace. When you last asked someone to send you some data, what format did you receive it in? Uh-huh.

When it comes to the database, people like to take data out of it. A lot. Every successful tool I have used for any database platform has

  • the ability to read data out of the database and write to Excel
  • the ability to read data out of Excel and write to the database.
  • Knowing that people are doing this every day must seriously freak out anyone who is responsible for data integrity and security. Once data leaves the database, it’s almost impossible to protect. Let me make a very important distinction here – I am talking about production systems. If you are in R&D or Quality Control, you have little to worry about.

    An emerging trend I have encountered while working with public sector accounts in the US is that users are being prohibited from saving data permanently outside the database. Writing to Excel is no longer a whiz-bang feature. Instead, it is something to cause one to worry about their job security.

    I think it’s only a matter of time before this becomes the new de facto standard.

    Thankfully there are many ways around this new paradigm. For one, you could easily crunch data in an instance of Excel, and not save the file to disk.

    Do me a favor, go to your Documents folder and do a quick count of how many XLS/XLSX files you have.
    Would you or your employer be nervous if your machine was lost or stolen? Many sites are going to the extreme of encrypting every laptop in an organization. I predict that will become the norm in the not too-distant future.

    For the record, I WOULD be nervous.

    Why are people insisting on using Excel?
    I think the most obvious reason is they are comfortable with pivot tables, excel macros, and the like than they are with SQL constructs like GROUP BY and relational algebra and group theory. These folks probably shouldn’t even be in the database to begin with. We wonder why their queries aren’t the most spectacular, well it’s no wonder. They have been placed in a position to grab data from a database, and no one has made sure they speak the language.

    There’s a reason people write SELECT *, because they can. Steve Jones and I can talk till we’re blue in the face, but users will find the quickest way to their end goal, and damn the consequences. We can’t really blame them either. Business users and developers have tons of work to accomplish with inadequate resources, budget, and timeline.

    What we can do is continue to educate the community and always be ready to give great advice when a user asks for help.

    Speaking of those R&D folks that are using Excel to upload data to the database…
    Save those files as a CSV and build a SQL*Loader control file to load your data to Oracle. You’ll pick up a new trick for your toolkit, and when you get to those 1,000,000 spreadsheets, the database will churn threw that data faster than you can imagine. And as @patch72 and @leight0nn reminded me yesterday, check out an External Table while you’re at it.

    Author

    I'm a Distinguished Product Manager at Oracle. My mission is to help you and your company be more efficient with our database tools.

    8 Comments

    1. The problem, Jeff, is that we have a revolution in data availability that is far ahead of the evolution in corporate structures. Where databases were once the arcane domain of programmers that toiled in front of green screens crunching JCL against VSAM files a scant 25 years ago, they are now ubiquitous in the business world. Even secretaries now create and maintain databases, even if they do not know they are doing so.

      IT departments have not kept pace with the trend because so much of the database development and usage has happened under the radar on the desktop of individual users who have MS Office or MS CRM tools on their personal machines. I am sure most CIOs would break out in a cold sweat if they really knew how much mission critical data or legally protected data lives on individual PCs (laptops these days). Most CEOs would run and hide if they knew how far removed the spreadsheets they receive are from the corporate data stores they purport to represent. When I open up a 50 MB spreadsheet with 20 tabs filled with thousands of vlookup formulas I could not audit in a year, I want to scream. I know something is probably wrong in there, but I’ll never find it in the hour I have to review it.

      The fault there lies squarely with IT departments who have not kept pace with the needs of the business community to access, analyze, and report on data in real time and in an environment where data needs change daily. Business users understand that it takes time to build a robust database, and they don’t have time — so they build a spreadsheet or an Access database because that is all that they have. It works…for a short period.

      In order to break out of that paradigm, IT departments will have to start decentralizing and embedding database professionals in business units where they can catch these efforts and steer the data management into a structured environment before things are out of control and where they can demonstrate the power of well written queries over spreadsheets or Access debacles.

      Until we make that cultural and structural shift, business users will continue to proliferate data outside the enterprise database environment.

      • JeffS

        You’ve got lots of competition in that space, but thanks for the heads up, never seen that commercially delivered as a PL/SQL package before.

        So I guess this means you’re on the ‘Yes you should’ side of the argument?

    2. Pingback: Three Ways to get Data into Excel with Toad | 140,000 Characters or Less

    3. Pingback: Access of Evil | 140,000 Characters or Less

    4. Pingback: Spreadsheet Databases | Noel NOT NULL;

    5. I find people having a hard time mastering JOINs Patrick, but you are absolutely right. APEX could be a silver bullet solution here, or it could be easily abused.

    6. The current versions of Oracle can do (almost) everything in SQL that Excel can. Using the pivot clause, Analytical Functions etc makes the use of Excel obsolete.
      If you use APEX for instance, you can create a simple application surrounding the data and keep full control over the security of it.
      Kindest regards,
      Patrick ‘@Patch72’ Barel

    Write A Comment