Wednesday, April 1, 2015

Automated Generation Of SQL From Google Spreadsheets Using The Revealing Module Pattern

Update 2015-06-02: Fixed a bug that affected code generated for MySQL only where the CREATE TABLE statement column types were all assigned "undefined". Thanks to Ivan for spotting that and alerting me. Code on GitHub now fixed too.

Update 2015-06-09: Some users have reported problems with the shared sheet not generating the SQL as expected. If you encounter problems, try creating your own spreadsheet and then add the three code files (either from the shared sheet or from the GitHub link).

Introduction

I am currently writing an update of my Google Spreadsheet Programming book (free for those who buy, or already have bought, the current version). I want to include a new chapter that describes a large and potentially useful application that brings together, in a coherent manner, many of the topics described in the book.
I work with spreadsheets and relational databases on a daily basis and frequently have to move data between them. I have written some scripts in the past in shell and Python that automated/semi-automated the process of transferring a tab-separated file into a relational database. These scripts generated the SQL CREATE TABLE statement using the column headers and also generated the upload scripts to transfer the data.
I decided to write a similar application in GAS that uses the column headers to generate the CREATE TABLE statement and then generates INSERT statements for each data row in the input sheet. Having now written the application, I am finding it more useful than I expected so I am hoping that others might also find it useful.
If you use relational databases and would like to automate generation of both CREATE TABLE and SQL INSERT statements, then read on. Even if this is not an area of interest for you but you are interested in GAS/JavaScript, then you may derive some benefit from it too as it employs a useful JavaScript pattern and builds a practical, if plain, GUI.

 

 Setting Up The Application


Here is a link to the spreadsheet containing the application. This link lets you view only. To make your own copy of the application:
  • Create a new spreadsheet and add a sheet named "PasteTableDataHere". The name you give to the spreadsheet is unimportant but the application expects to find the input data in a sheet with this name.
  • Got to the GAS script Editor and create a project. The name does not matter, I called mine "spreadsheet2sql_code".
  • Get the three files from this GitHub repository (Code.gs, index.html and spreadsheet2sql.gs). It is best to check out the code using a Git client but you can copy the data directly. It is best to copy from the GitHub "Raw" view because indentation is retained when copied from this source.
  • In the GAS editor of your spreadsheet, create three empty files with these names and then copy and paste the code from each to your corresponding file.
  • When you re-open the spreadsheet, the menu should appear and you will be able to display the form and run the application.
  • If you need test data to get started, copy the data from the linked spreadsheet above into the "PasteTableDataHere" sheet and run the application on this small data set.
  • Execute the generated SQL in the relational database that you chose on the form.
  • Please feel free to email me if you have any difficulty (mick@javascript-spreadsheet-programming.com).

Overview Of Techniques Covered

Apart from the practical utility of this application, the code also covers:
  • The revealing module pattern
  • HtmlServices GUI building
  • Client side JavaScript
  • Menus
  • JavaScript array manipulation
  • General GAS programming techniques using,  among other things, Range and Sheet objects

Application Requirements Overview

  • Run from a HtmlService GUI where the GUI is displayed using a custom menu
  • Output consisting of CREATE TABLE and INSERT SQL statements is written to a new spreadsheet
  • Generates a primary key column in the output
  • The output SQL statements can be written in one of three open source relational database SQL "dialects"; PostgreSQL, MySQL and SQLite
  • The output can be pasted into a suitable database query tool (for example pgAdmin3 for PostgreSQL) and executed or can be saved to an SQL source file and executed by a database client program such as psql for PostgreSQL, mysql for MySQL and sqlite3 for SQLite
  • Can assign data types to columns. Distinguishes between text, integer, floating point number and date types
  • Able to recognise empty cells as NULLs that are accounted for correctly in the generated INSERT SQL statements

Images Of Input Sheet Data, The GUI, and Output Generated By Application

Screenshot of input data in spreadsheet

The Input Data




Above is a screenshot showing the input data. Those who have learned Oracle SQL some years ago may recognise it as being taken from the emp table of the old teaching data schema scott. I changed one value in the column entitled comm from 500 to 500.5 in order to force the application to detect it as a floating point number and thereby generate a floating point column type. The application is hosted in the spreadsheet named "SpreadsheetToSQL", this is shared below. Note that there is a custom menu entitled "Create SQL" on the right-hand side that appears when this spreadsheet is opened.

The GUI

Selection the "Create SQL" menu item "SQL Create Form" displays the following GUI:

There are four inputs (top to bottom):
  1. The name you want to give to the database table, emp here
  2. The target relational database management system (RDBMS), PostgreSQL in this example. The drop-down also has values for MySQL and SQLite.
  3. The name for the spreadsheet to which the SQL statements are written, ScottEmp in this example. The spreadsheet will be created and stored in the user's Google Drive.
  4. The name for the sheet to which the output SQL statements are written, scott_emp in this example.
Once the "Make SQL" button is clicked, the application begins generating the output SQL. When finished, a "Done" message box appears and the GUI disappears. If there has been an error, check the View->Logs menu in the script editor window.

The SQL Output

When the application has completed and the GUI has disappeared, go to your Google Drive (root folder) and open the newly generated spreadsheet. Here is what you should see:



All the output data is in column A. Cell A1 contains the CREATE TABLE statement and the rest are the INSERT statement for the data rows. Each line is terminated with the SQL semi-colon command termination characte.

The RDBMS chosen in the GUI was PostgreSQL so we can copy the statements above and paste them into the pgAdmin3 query window as shown below: