Update 2015-06-09: Some users have reported problems with the shared sheet not generating the SQL as expected. If you encounter problems, try creating your own spreadsheet and then add the three code files (either from the shared sheet or from the GitHub link).
I work with spreadsheets and relational databases on a daily basis and frequently have to move data between them. I have written some scripts in the past in shell and Python that automated/semi-automated the process of transferring a tab-separated file into a relational database. These scripts generated the SQL CREATE TABLE statement using the column headers and also generated the upload scripts to transfer the data.
I decided to write a similar application in GAS that uses the column headers to generate the CREATE TABLE statement and then generates INSERT statements for each data row in the input sheet. Having now written the application, I am finding it more useful than I expected so I am hoping that others might also find it useful.
Setting Up The Application
Here is a link to the spreadsheet containing the application. This link lets you view only. To make your own copy of the application:
- Create a new spreadsheet and add a sheet named "PasteTableDataHere". The name you give to the spreadsheet is unimportant but the application expects to find the input data in a sheet with this name.
- Got to the GAS script Editor and create a project. The name does not matter, I called mine "spreadsheet2sql_code".
- Get the three files from this GitHub repository (Code.gs, index.html and spreadsheet2sql.gs). It is best to check out the code using a Git client but you can copy the data directly. It is best to copy from the GitHub "Raw" view because indentation is retained when copied from this source.
- In the GAS editor of your spreadsheet, create three empty files with these names and then copy and paste the code from each to your corresponding file.
- When you re-open the spreadsheet, the menu should appear and you will be able to display the form and run the application.
- If you need test data to get started, copy the data from the linked spreadsheet above into the "PasteTableDataHere" sheet and run the application on this small data set.
- Execute the generated SQL in the relational database that you chose on the form.
Overview Of Techniques CoveredApart from the practical utility of this application, the code also covers:
- The revealing module pattern
- HtmlServices GUI building
- General GAS programming techniques using, among other things, Range and Sheet objects
Application Requirements Overview
- Run from a HtmlService GUI where the GUI is displayed using a custom menu
- Output consisting of CREATE TABLE and INSERT SQL statements is written to a new spreadsheet
- Generates a primary key column in the output
- The output SQL statements can be written in one of three open source relational database SQL "dialects"; PostgreSQL, MySQL and SQLite
- The output can be pasted into a suitable database query tool (for example pgAdmin3 for PostgreSQL) and executed or can be saved to an SQL source file and executed by a database client program such as psql for PostgreSQL, mysql for MySQL and sqlite3 for SQLite
- Can assign data types to columns. Distinguishes between text, integer, floating point number and date types
- Able to recognise empty cells as NULLs that are accounted for correctly in the generated INSERT SQL statements
Images Of Input Sheet Data, The GUI, and Output Generated By Application
|Screenshot of input data in spreadsheet|
Above is a screenshot showing the input data. Those who have learned Oracle SQL some years ago may recognise it as being taken from the emp table of the old teaching data schema scott. I changed one value in the column entitled comm from 500 to 500.5 in order to force the application to detect it as a floating point number and thereby generate a floating point column type. The application is hosted in the spreadsheet named "SpreadsheetToSQL", this is shared below. Note that there is a custom menu entitled "Create SQL" on the right-hand side that appears when this spreadsheet is opened.
The GUISelection the "Create SQL" menu item "SQL Create Form" displays the following GUI:
There are four inputs (top to bottom):
- The name you want to give to the database table, emp here
- The target relational database management system (RDBMS), PostgreSQL in this example. The drop-down also has values for MySQL and SQLite.
- The name for the spreadsheet to which the SQL statements are written, ScottEmp in this example. The spreadsheet will be created and stored in the user's Google Drive.
- The name for the sheet to which the output SQL statements are written, scott_emp in this example.
The SQL OutputWhen the application has completed and the GUI has disappeared, go to your Google Drive (root folder) and open the newly generated spreadsheet. Here is what you should see:
All the output data is in column A. Cell A1 contains the CREATE TABLE statement and the rest are the INSERT statement for the data rows. Each line is terminated with the SQL semi-colon command termination characte.
The RDBMS chosen in the GUI was PostgreSQL so we can copy the statements above and paste them into the pgAdmin3 query window as shown below:
Hitting the execute button in the pgAdmin3 Query window will create the table emp and insert the 14 data rows into it. Note a minor inconvenience: The CREATE TABLE statement is enclosed in double quotes to preserve the embedded new line characters, these quotes have to be removed before executing the statement.
Code DiscussionI am not including the code here because it is quite long and it is extensively documented. If you wish to understand the code, then copy the spreadsheet and then read the comments given in the three files where the application is defined.
The application code is divided into three files:
- Code.gs is a standard GAS code file containing functions that display the menu, call the code to build the GUI, take the GUI input and pass it to a function that uses the module (spreadsheet2sql) to create the SQL statements, and write these statement to a newly created spreadsheet.
Concluding RemarksThe best way to learn from this blog entry is to make a copy of the spreadsheet hosting the application and to experiment with it and read the code and code comments. The application is useful to me and I hope it will also be useful to others. I have tested it quite heavily and used it frequently and, so far, it has worked quite well. However, I can offer no guarantees so please check inputs and outputs carefully and let me know of any bugs. I have used it for quite large data inputs of up to 30,000 data rows and it worked fine. Google Spreadsheets have improved enormously recently and a row count of 30,000 would not have been feasible in earlier versions. However, here are two important potential problems to bear in mind
- Floating Point Number Precision: For MySQL this application uses the DECIMAL type. If you need very high precision, check the data types defined for "float" for each of the three RDBMs and alter accordingly.
- Copying Large Amounts Of Data In Google Sheets: When I tried to copy the 30,000 rows to the sheet named "PasteTableDataHere", only about 1,000 rows were copied even though I selected the full sheet. I circumvented this problem by resorting to GAS to copy the full target sheet that I then re-named to "PasteTableDataHere". I will post an entry on this in the near future.
This application could be extended in a number of ways. PostgreSQL and MySQL support the concept of metadata where documentation can be entered into the data dictionary to describe database objects such as tables and columns. In PostgreSQL the SQL syntax to this is like:
COMMENT ON COLUMN my_table.my_column IS 'Employee ID number';
One approach to document the columns could be to add notes to the header row cells that could then be used to generate comments like the one given above. A textarea input could also be added to the HTML form to allow the user to enter multi-line table documentation into the RDBMS data dictionary. Believe me, this type of documentation is crucial when dealing with complex database schemas.
As stated earlier, I will add a chapter to my book that will discuss this application and the underlying code in greater detail. I will also re-visit this blog entry from time to time to improve it. The linked spreadsheet application will also be reviewed and improved so, if interested, please re-visit this page to see how it has changed. Meanwhile, happy coding!