Wednesday, September 25, 2013

Regular Expressions - Part 1

A complete version of this and the following blog entry will be added as an appendix to this leanpub book.

Update 3rd-Sep-2014: Published almost a year after Part 1, here is Regular Expressions - Part 2 (Much longer and more detailed than this entry!)

Introduction

A regular expression describes a text pattern. Regular expressions can be used to locate complex patterns within textual data and these patterns can then be replaced with new text. Regular expressions have a reputation for being difficult to write and even more difficult to maintain. There is at least a grain of truth in a witticism attributed to Jamie Zawinski:

Some people, when confronted with a problem, think
“I know, I'll use regular expressions.”   Now they have two problems.


Source of the quote is here.

Some factors that contribute to the complexity of regular expressions are:
  1. The cryptic and terse nature of the syntax.
  2. The same symbols can have different meanings in different contexts.
  3. Multiple ways specifying patterns for the same purpose.

To get a feel for how complex regular expressions can become, have a look at some of the examples here! This expression is supposed to test for a valid email address, I have not tested it so cannot say if it does what it is supposed to do but it is worth having a look at the to get a feel for how truly complex regular expressions can become.

Despite their potential complexity, regular expressions are really useful tools and some of the most useful ones are not at all difficult. They crop up in many different settings and can greatly simplify task that would be exceedingly difficult without them. There is lots of freely available information on them on the internet and a simple Google search can often be used to find a ready-made solution that someone  else has written and then posted on a website or blog for others to use. As with any code taken from the internet, it is really important to test it to ensure it works as advertised.

Regular expressions, from here on referred to as REs, first became popular in UNIX text editing tools such as ed and sed back in the 1970s. Perl greatly extended the power of the sed implementation and and Perl REs have become the de facto standard to such an extent that other languages frequently describe their own RE implementations as "Perl compatible". Although Perl has done much to both popularise and extend REs, other languages, including JavaScript, now have good RE implementations of their own as do many relational databases such as Oracle, MySQL and PostgreSQL.

JavaScript's RE syntax and semantics were borrowed directly from Perl but some advanced Perl RE techniques are absent.  Google Apps Script implements JavaScript REs in full so client side RE examples can often be used in Google Apps Scripts and vice versa.

Incidentally, REs have been available in Excel VBA for some time (on Windows but not on Mac OS X) via a VBScript library that has to be loaded from Visual Basic Editor (Tools->References->Microsoft VBScript Regular Expression 5.5) . This fact is often over-looked by VBA developers.

Google Spreadsheet Regular Expression Functions

Google Spreadsheets has an extensive range of built-in functions for text manipulation; The full list is here. Many of these are also found in Excel and in other spreadsheet applications but three RE-related ones are unique to Google Spreadsheets (to the best of my knowledge):

  1. REGEXMATCH
  2. REGEXEXTRACT
  3. REGEXREPLACE

These three functions provide a convenient starting point for introducing REs without having to jump straight into JavaScript.

REGEXMATCH is the easiest of the three, so examples of its use are given first. It simply returns a Boolean value indicating if the given pattern is found in the input text.

Figure 1: Testing function REGEXMATCH with input and regular expressions.

Figure 1 above is a screenshot of a spreadsheet where the REGEXMATCH function was tested. Column A contains the test input, column B contains the result of formula given in column C. The question the formula was written to answer is stated in column C.

The REGEXMATCH function takes two arguments: The test text and the regular expression enclosed in double quotes. The test text can be either a cell value, the return text from another spreadsheet function, or a literal value. If it is a literal value then it too must be enclosed in double quotes.

Here is a very brief description of the REs given above:

  • The first, "\d", matches any digit (0-9). If the input contains at least one digit, it returns TRUE. The "\d" character combination is what is known as a character class abbreviation. The backslash is a metacharacter which essentially means that it signifies something other than its literal meaning. When it precedes a "d" in an RE the combination means a single instance of any digit. This can also be specified using a character class where it is specified as "[0-9]". This is an example of the redundancy in REs, there are frequently equivalent alternative ways of specifying patterns.
  • The second RE, "[^a-zA-Z0-9.\s]", is a more complex example. The objective here is to determine if the test text contains any character that is not a digit, period (dot), or an English alphabet letter. It uses the a fore-mentioned character class. This is a negated character class, its negated nature is conferred by the caret (^). The caret is another metacharacter but it is a metacharacter with different meaning depending on its context. When present as the first character in a character class (that is within [ and ]), it means a match will be found if some character not specified within the character class is found in the test string. The example here uses a character class range, "a-z" and "A-Z" mean any lower case or upper case letter, respectively. 0-9 means any digit and is equivalent to "\d". The period is another interesting character because inside a character class it represents itself but outside one, it means any character except a new line character. The examples that return TRUE for the test RE do so because they contain currency symbols. The "\s" within the character class is another example of a character class abbreviation and it represents the space character, tab, and new line.
  • The "^\d" RE returns TRUE when its test text begins with a digit. The caret symbol in this context is an anchor; It does not represent a character but rather a position, the equivalent for the end of the string is the dollar ($) symbol. The caret character has a dual role in REs, in the earlier example, when it is the first character within a character class it negates the meaning.
  • The RE "colou?r" matches an input string if it contains the string "color" or "colour". This could either be as a word or as part of a word such as "de-colouring" for example. The "?" is another metacharacter with multiple meanings. In this context, the immediately preceding character the "u" in this example is optional, is made optional by the "?".
  • The final RE, "^\d+$", returns TRUE only if the test text is an integer. The caret and dollar metacharacters are both anchors in this context meaning start and end of a the string, respectively. The TO_TEXT spreadsheet function is necessary to convert the numbers into text. There is no automatic coercion of numbers into text and the REGEXMATCH function expects text.

The REGEXEXTRACT And REGEXREPLACE Functions

REs  can also be used to great effect with these two functions.

The REGEXREPLACE is the more useful of the two functions. It takes three arguments:
  1. The text input text. This can be from a spreadsheet cell, the result of a spreadsheet formula, or a string literal.
  2. The regular expression to be replaced which is enclosed in double quotes.
  3. The string literal to replace the regular expression, also enclosed in double quotes.

REGEXREPLACE executes a global replace, all occurrences of the regular expression are replaced.
Some examples are shown in Figure 2 below:

Figure 2: Testing the REGEXREPLACE Function

Figure 2 above is a screenshot of a spreadsheet where the REGEXREPLACE function was tested. Column A contains the test input, column B contains the result of formula given in column C. The target pattern to be replaced is described in column C.

Here is a brief description of the REs shown in figure 2 above.
  • The first RE "\d" simply deletes all digits from the input text where "\d" represents any digit as explained in the REGEXMATCH examples.
  • The second example deletes all lower-case letters. These are identified by the RE "[a-z]".
  • The third example ("[-]+") replaces one or more consecutive  dash characters with a single space. The plus sign (+) is another metacharacter and means one or more of the preceding character. The ? just means 0 or 1.
  • The RE "\b" in the fourth example introduces another anchor; The word boundary anchor \b. Like the string start and end ^ and $ anchors, \b does not represent characters. Instead it represents the non-word-word boundary. The RE's understanding of a word is broader than that of a human language. Any run of one or more alphanumeric characters or underscore are considered to be words. The interesting point in this example is that the anchor \b can be replaced with characters, in this example the caret (^) was used.
  • The last example replaces the RE "\d+$" with nothing. In effect, it deletes any trailing digits in the text. 

The REGEXEXTRACT function extracts only the first occurrence of an RE in the input text. For example, this invocation returns 23.

=REGEXEXTRACT("Total Cost € 23:", "\d+")

However, it would be much more useful if it were to return an array of matches. The SPLIT function can return an array elements so that this formula:

=SPLIT("123:456:789", ":")

returns the values of the resulting array in separate cells. Unfortunately, the SPILT function does not accept an RE as a delimiter as the split function in Perl, for example, does

This article has just introduced REs using the spreadsheet functions that Google provides. Part 2 will take REs further by showing how they can be written in Google Apps Script/JavaScript.



Friday, July 5, 2013

Manipulating Documents From Spreadsheets

Introduction
Google Spreadsheets can run scripts to interact with and manipulate Google applications and services other than the host spreadsheet application itself.  For example, earlier blog postings showed how spreadsheet-hosted scripts could be used to interact with gmail and the MySQL database.  Much of the power of Google Apps Scripting derives from just such interactions with other services and applications. This posting shows how a Google Apps script hosted by a spreadsheet can be used to create and manipulate a Google document. An earlier blog entry mentioned that it has recently become possible to write Google Apps scripts hosted by the Documents application. This means it is now possible to write Google Apps Script code hosted by the Document application that can be used across Google applications and services.

Task
To demonstrate the cross-application nature of Google Apps Script, I have written some code to copy the cell values of a named spreadsheet range to a table in a newly created document. The code also makes the header row of the table bold.

Code
The source code can be viewed here on GitHub. To facilitate the subsequent discussion, here is the same code viewed from Pastebin (to see the code, note that the Pastebin iframe has its own vertical scrollbar):





Spreadsheet Test Data
Here is a screenshot of sample data that I wish to copy into a Document table:
Add  some data like this to your own spreadsheet and give the data range a name (Menu Action: Data->Named and protected ranges...). For the code below the name given was "ContactDetails". See an earlier posting on how to define and pro-grammatically use named spreadsheet ranges.

Usage
  • Once the data has been entered, paste the above code into the spreadsheet's script editor.
  • There are three functions, the one to call is unimaginatively named main().
  • The first time main() is called, you will be prompted to "authorize" it because it needs to write a new file to your Google Drive.
  • Once authorized and run, a messagebox dialog should appear in the spreadsheet view to indicate success or failure (remember to switch from the Script Editor view to the Spreadsheet view to see messagebox!).
  • If the code has run successfully, a new document should appear in your Google Drive called "ContactDetails".
Using the input above, my document output looks like the following:
The code shown earlier has successfully created and saved a new document and copied a range of spreadsheet values into this new document. It has also manipulated the document table so that the header row appears in bold type.

Code Notes
  • The comments in the code shown in the Pastebin iframe above are worth reading to understand how it performs its function.
  • The source code has also been passed though JSLint to check for poor formatting and general sloppiness! I would like to re-emphasize once again how important it is to subject code to JSLint or an equivalent tool. I wrote a piece before on JSLint that might be worth reviewing
  • The "use strict"; line is inserted at the top of each of the three functions to disallow problematic JavaScript usage. This is an ECMAScript 5 enhancement.
  • Apart from function writeTableToDocument(), there is nothing in the other two functions that is new or that has not been covered in earlier blog entries.
  • Function writeTableToDocument() (lines 22-49) is the one to concentrate on. It takes as parameters a string name and a nested JavaScript array as returned by the Range getValues() function. It creates a document with the name given in the parameter using the DocumentApp method create() (line 40). DocumentApp is analogous to SpreadsheetApp and has many analogous methods for doing things like getting the active document.
  • It then adds an instance of the type Table to the document  body and copies the input array to the table (line 41).
  • The Body method call appendTable() on line 41 manages to both copy the input array data into the new Table instance while also returning that same table instance. This is very convenient as it allows the Table instance to be further manipulated in the code lines that follow. 
  • The code then uses the newly created instance to extract the first row as an object of type TableRow and uses its setAttributes() method to set the font type to bold (lines 42-44).
  • Note how a styles object is created and used in this function to apply document styles. This same pattern is used to format document text and can be used for headings and paragraphs.
  • The two functions used by main() in the code example each check their input arguments and they each throw errors if the input arguments are not of the correct number and type. These checks might seem excessive but because JavaScript functions are so "relaxed" about the number and types of arguments, these types of checks are required to make the code as robust as possible.
  • Perhaps surprisingly, when the code is re-executed, a new file with the same name appears in Google Drive. The old file is not overwritten nor is any error/exception thrown. This type of behaviour differs from that seen in standard Unix and Windows environments.

Summary
This entry and the code presented here is only a taster for what can be achieved by using the scripts written in one application to manipulate another. Since Google Documents now supports Google Apps Script, we could have performed the same task using a script hosted by Documents where it would create a SpreadsheetApp instance and use it to access and copy a named range.  Later postings will again explore this type of cross-application scripting. Meanwhile keep an eye on my Leanpub book which will have a chapter or two on this topic.


Saturday, May 25, 2013

Html Service Basics - Write Form Data To A Spreadsheet

Update 14-Sep-2015: See chapter 7 in  leanpub book This chapter has been fully re-written and expanded
As stated in the previous post, GUI Builder is being phased out by Google and Html Service is now the preferred method for building user interfaces for Google Apps Script.

An earlier blog entry entitled GUI Basics - Write Form Data To A Spreadsheet showed how to build a very simple GUI composed of just two text boxes and two buttons that allowed the user to enter values into the text boxes and then add them to the spreadsheet using a button click.  The code to produce the GUI and implement such trivial functionality was quite lengthy and required some detailed explanation. 

I have now started looking at Google Html Service in some detail and, as part of this learning process, I have used it to re-implement the GUI built in the earlier post.  My verdict so far: I really like it!
  • There's a lot less Google Apps Script code.
  • It uses HTML and feels very like plain old web development.
  • The presentation in the HTML is separated from the Google Apps Script code that works with the spreadsheet.
  • I'll be able to practice and use CSS and really cool libraries like JQuery.

Code To Build The GUI

Firstly, we need a file for the HTML. In the Script Editor, select the menu sequence File->New->Html file.  I named the file "index.html" (a conventional name for the default page in web development).
The file contains the following HTML and JavaScript code:
    <b>Add Row To Spreadsheet</b><br />

    <form>
    First name: <input id="firstname" name="firstName" type="text" />

    Last name: <input id="lastname" name="lastName" type="text" />

   <input onclick="formSubmit()" type="button" value="Add Row" />
   <input onclick="google.script.host.close()" type="button" value="Exit" />
   </form>
  <script type="text/javascript">
        function formSubmit() {
            google.script.run.getValuesFromForm(document.forms[0]);
        }
    </script>

Standard Google Apps Script is then written, here's the code to produce a GUI similar to the one constructed in the earlier posting:

function demoHtmlServices() {
  var ss = SpreadsheetApp.getActiveSpreadsheet(),
      html = HtmlService.createHtmlOutputFromFile('index');
  ss.show(html);
}

//getValuesFromForm
function getValuesFromForm(form){
  var firstName = form.firstName,
      lastName = form.lastName,
      sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  sheet.appendRow([firstName, lastName]);
}


To see the data input form, execute the function  demoHtmlServices() and switch to the spreadsheet.  The following screenshot shows what is displayed

To see the form in action, add text values to the text boxes and click the button labelled "Add Row" and you should see your form input appear on the active sheet.

Code Notes
  • This data entry form requires a lot less code than the previous GUI version that used UiApp but functionally it is equivalent.
  • The HTML code is standard except for two non-standard features in the embedded JavaScript:
    1. The mechanism used to link the Google Apps Script function getValuesFromForm() to the form formSubmit() action.  This link is made by the line google.script.run.getValuesFromForm(document.forms[0]);
    2. The action for the close button google.script.host.close()
  • The Google Apps Script function getValuesFromForm() takes a form instance as an argument and uses the name attributes of the text inputs to extract the entered values that it then writes to the sheet.

I did not attempt to discuss HTML here because there are so many books and on-line tutorials available that there really is no need to do so.  One on-line source that I use is the W3C site.

This blog entry is just a very basic introduction to the Google Html Service.  I'll post lots more in the future and will also put at least one chapter on this topic into my work-in-progress book. Having just played with Html Service for an hour or so, I'm really excited about it and I feel it brings a lot to Google Spreadsheet programming and now makes it look and feel a lot like web development.  There is therefore the added incentive of gaining very useful transferable skills.

 See also these later blog entries for more on Html Service

Update 3-Apr-2015: See a recent blog entry for a practical GUI example
Update 11-Apr-2015: See another practical GUI example in this blog entry



Thursday, May 23, 2013

GUI Builder Out, Html Service In

It is proving difficult to keep up with all the changes in Google Apps.  I suppose in some ways this is a good thing because it shows that the whole area is moving forward.  However, it complicates matters when you're trying to write about it.  One change of immediate significance is that GUI Builder is being phased out. Quote "The GUI Builder will continue to be available until September 9, 2013. After that point, you will not be able to create or manage GUI Builder components, although existing components will still function."

The Html Service is the wave of the future so I'm planning to read up on this and I'll do a posting on it soon.  One real positive I've taken from the Google link above is that Html Service can be used with JQuery and JQuery UI.  That's a real boon to Google Apps developers in that the skills gained will be transferable to wider web development.

Another link with some more information.

Wednesday, May 22, 2013

Latest From Google - Google Apps Scripts In Docs

I'm preparing a blog entry on manipulation of Google Docs from Spreadsheets and have just noticed that you can now write scripts hosted in Docs just as in Spreadsheets

That entry was date 14th May this year.

Exciting times!

I'll do my blog posting as planned but this Google announcement is a significant development.


Saturday, May 11, 2013

VBA and Google Apps Script Side-By-Side


Appendix A, provisionally entitled  "Appendix A: Excel VBA And Google Apps Script Comparison"
has been added to my book Google Spreadsheet Programming

Download the free sample which includes the appendix covering common spreadsheet programming tasks in both VBA and Google Apps Script.


Please send me any comments or suggestions that you have so that I can add to and improve the material it contains.  More content will be added in due time.

Some Notes On Google Apps Scripts and VBA


The most common type of request that I receive from readers is how to translate Excel VBA into Google Apps Script.  Google Spreadsheets implements most of the standard spreadsheet functions in addition to a few of its own.  However, from a programming perspective Excel and Google Spreadsheets are quite different:

  • VBA is the last survivor of the old pre-VB.NET VB lineage with the main branch ending at VB6.  Google Apps Script is a JavaScript dialect with a very different syntax and philosophy from VBA.  Use of semi-colons to end statements, curly braces to delimit code blocks, and case sensitivity are just a few of the more obvious ways in which Google Apps Script differs from VBA.
  • Google Apps Script, being a JavaScript dialect, comes with some really useful productivity enhancers such as functions as first class objects, very flexible data structures in the form of its arrays and objects, and built-in regular expressions.  See Douglas Crockford's very well-known O'Reilly title JavaScript The Good Parts for all the details.

However, both languages are hosted within applications and are used as the "glue" to integrate applications.  Their APIs are therefore quite similar in philosophy if not in detail.

Check out the PDF and let me know what you think!

Update 22-May-2013
I'm getting positive feedback on this blog entry so there is clearly some interest in this. Accordingly, I will continue to update the PDF with additional code examples.  Suggestions or examples are also very welcome!

Wednesday, February 20, 2013

GUI Basics - Write Form Data To A Spreadsheet

Note Added 2013-06-02: The GUI Builder is being phased out, Html Service is now the preferred GUI tool, to see how it used to replicate the functionality developed here  see blog entry on this.

(Added 17th May 2013): Simple GUI creation is described in much greater detail with additional examples in my leanpub book "Google Spreadsheet Programming" in chapter 7 .

Learn How To:
  • Create a GUI form without the GUI Builder
  • Create and configure textbox, label and button widgets
  • Create handlers
  • Link handlers to widgets and callbacks
  • Write the form text box input to a spreadsheet

Two earlier blog entries covered use of the GUI Builder for construction of a simple user interface:
  1. Blog Entry 1
  2. Blog Entry 2

The GUI Builder is very convenient. However, to really learn how to build GUIs in GAS there is no substitute for hand-coding the application.  This entry builds a really simple GUI to demonstrate how to:
  • Create and place GUI components -widgets- on the user interface (what is known as a form in VB).
  • Associate callbacks with widget actions such as button-clicking.
  • Extract values from text box widgets.
  • Write text input values to a spreadsheet.

The code example is quite long so it has not been pasted in here but it can be retrieved from this GitHub link.

As an experiment, I have put the code in Pastebin so it can be viewed here in an IFrame.  I tried to do this with the GitHub page but the IFrame remains empty.



Here is the code as formatted by Pastebin:

 

To run the code, copy it from above or from Github and paste it into the Script Editor. Then execute the function guiDemo(). Switch to the spreadsheet to see the GUI, add some text to the text boxes and click the button labelled "Add Row". You will then see your input written to the spreadsheet.

The GUI should display like this:


To close it, simply click the button labelled "Exit".


It is probably easier to follow along if you paste the code into a Script Editor.

Code Notes
  • The GUI is built by the function guiDemo() (defined in lines 32 - 74).
  • The GUI resides in the spreadsheet and cannot be run standalone in its current for.
  • Three lines are common to all of these spreadsheet-hosted GUIs in GAS:
    1. Create a UiApp instance (line 34).
    2. Create a SpreadsheetApp instance (line(35).
    3. Attach the UiApp instance to the SpreadsheetApp instance using the latter's show() method (line 73).
  • The GUI layout is achieved by adding so-called horizontal panels to a top-level vertical panel.
  • The label-text box pairs are then added to the horizontal panels and their order and their containing-panel order is determined by the sequence of calls to the panel add() method.
  • The button click events do the real work of taking text box input and writing it to the spreadsheet.
  • The button click events are linked to handler function as follows (taking the "Add Row" button as an example:
    • The handler object is created: addRowHandler = ui.createServerHandler('addRow'); passing it the function name as an argument. The function name given is that of the event handler (aka callback).
    • The click event is associated with the handler: btnAddRow.addClickHandler(addRowHandler);
    • Other widgets that need to be involved when the event fires need to be registered (lines 65 and 66 that call the handler object method addCallbackElement).
  • When the event handler is called, it takes the event object as an argument (named "e" in this example).
  • In order for the event to be able to access the text box values entered by the user, the text box name must have been assigned when the GUI was created. This was done in lines 67 and 68.
  • In order for the event to be able to alter a widget, e.g. re-set text box contents to an empty string, it needs an instance of the widget. To get the widget instance, the widget ID must have been set when the GUI was created. This was done in lines 69 and 70.

Some GAS GUI "Gotchas"
I've sometimes found that the GUI displays without error but does not do as I want. Here are a few of the reasons for this that I've encountered:
  • Forgetting to call the add() method on the panel or form so the widget never appears.
  • Confusing widget IDs and names. They are required for different things, see above!
  • Trying to change a GUI property in the event handler but then forgetting to return the UiApp instance (line 28 in the example).
  • Forgetting to register other widgets with the handler object (try commenting out lines 65 and 66 to see the effect).

This is a very simple example but it is worth playing with and extending to learn more.  More detailed examples and discussions are given in my book leanpub book in chapter 7 (just published)


Friday, February 8, 2013

MySQL Re-Visited - Metadata

Set Up a Free Cloud MySQL Instance 

I used Xeround.  Other options include Google and Amazon and many others.
Install the mysql command line client on own machine.
Log in to Xeround MySQL cloud instance using local mysql client.
Download and unzip the world.sql file
While logged in to Cloud mysql database instance from local machine:
-- Login looks something like this for Xeround
$ mysql --user=tigger --host=instance<some number>.db.xeround.com --port=<portno> --password
-- Create a database
mysql> CREATE DATABASE world;
-- Switch to the newly created database ,
mysql> USE DATABASE world;
-- Populate the database,
mysql> SOURCE /<path_to_sql_file>/world.sql; 

Once the above SQL script is executed, the database is populated.

Extracting Metadata From The Database

Important metadata can be retrieved using mysql commands but this entry is concerned with how JDBC can be used to obtain the same information. Important metadata about MySQL itself, tables, views, and stored procedures can be retrieved from the JdbcConnection object using the getMetaData() method which returns a JdbcDatabaseMetaData object. This object can be interrogated via its methods to retrieve database metadata.

Function To Print Metadata To The Log
function printMySQLMetadata() {
  var conn = Jdbc.getConnection('jdbc:mysql://instance<num>.db.xeround.com:<port>/world', <usename>, <pwd>),
      dbMetadata = conn.getMetaData(),
      rsTables,
      rsColumns,
      tableNames = [],
      i,
      tableCount;
  Logger.log('Major Version:' + dbMetadata.getDatabaseMajorVersion());
  Logger.log('Minor Version: ' + dbMetadata.getDatabaseMinorVersion());
  Logger.log('Product Name: ' + dbMetadata.getDatabaseProductName());
  Logger.log('Product Version: ' + dbMetadata.getDatabaseProductVersion());
  Logger.log('Supports transactions: ' + dbMetadata.supportsTransactions());
 
  rsTables = dbMetadata.getTables(null, null, null, ['TABLE']);
  while (rsTables.next()) {
    tableNames.push(rsTables.getString(3));
  }
  tableCount = tableNames.length;
  Logger.log('The Table Names And Their Columns Are:');
  for (i = 0; i < tableCount; i += 1) {
    rsColumns = dbMetadata.getColumns(null, null, tableNames[i], null);
    while (rsColumns.next()) {
      Logger.log(tableNames[i] + ':' + rsColumns.getString(4));
    }
  }
  conn.close();
}


Code Notes

  • Establish a connection to the database (reader will need to fill in the details in the connection string to match their own database).
  • Return the metadata as a JdbcDatabaseMetaData object by calling the getMetaData() method of the Connection object.
  • Call various JdbcDatabaseMetaData object methods to retrieve metadata and print the output to the log.
  • Surprisingly, the method supportTransactrions() reports true even though the tables are MyISAM type.  I assumed you needed InnoDB table types to get this feature in MySQL.  I'll check that the tables do indeed support transactions as this code suggests.
  • A list of tables is extracted using the record set (Recordset) returned by the method getTables() (record sets were covered in the earlier blog entries).  The three nulls are required in this method call.  The table type argument has to be passed as an array even though it has only one element.  Other table types that could be passed includes views.
  • The column index argument used in the record set getString() method to extract the table name is 1-based.  The argument given, that is 3, was discovered by trial-and-error.   The table names are then stored in an array.
  • The column names for each table are extracted by processing the columns record set returned by the getColumns() method with the table name as an argument.  Once again, three nulls are required in this method call.
  • The column names are in the fourth column of the record set (argument = 4)  in the columns record set getString() method call. Once again, this was found by trial-and-error.
  • Once the above function has executed, go to  Views->Logs... in the Script Editor to see the output which should begin something like this:
Major Version:5
Minor Version: 1
Product Name: MySQL
Product Version: 5.1.42
Supports transactions: true
The Table Names And Their Columns Are:
city:ID
city:Name
city:CountryCode
city:District


This was just an introduction to what is a large and important area.  I will write a full chapter on Google Spreadsheets/JDBC in this leanpub book in the coming weeks.  So if interested, keep a watch on that.  I'll also post more in future blogs.









Wednesday, February 6, 2013

Manage Google Drive Files and Folders

Update August 8th 2013: There is chapter 8 in this book that covers Google Drive files and folders extensively and their manipulation using Google Apps Script.

 

Introduction

You can manage your Google Drive files and folders using Google Apps Script.  The top-level "class" which implements this functionality is called DocsList.  The class exposes a large number of methods that can be used to manage Google Drive folders and files.  To see a full list, create a new spreadsheet and go to the Script Editor and type DocsList followed by period (".").

In addition to showing how to manipulate DocsList, the code given here also provides an example of how structured exception handling can be used to test for a specific error and respond appropriately depending on the error.

The code below demonstrates some of the main methods provided by DocsList.  To use, create a new spreadsheet and paste the code into the Script Editor. The code has been subjected to JSLint and is available here on GitHub.

/*global DocsList: false */

// Manage Google Drive Folders.
// Check if the given folder name exists.
// If yes, return the folder object for the pre-existing folder.
// If not, create the folder using the given folder name.
// Return the folder object for the newly created folder with the given folder name.
function createFolder(folderName) {
    'use strict';
    var folder,
        errMsg = 'Cannot find folder';
    try {
        folder = DocsList.getFolder(folderName);
    } catch (Error) {
        if (Error.message.indexOf(errMsg) > -1) {
            folder = DocsList.createFolder(folderName);
        } else {
            throw {'name': Error,
                   'message': 'Unable to create folder ' + folderName};
        }
    }
    return folder;
}
// Manage Google Drive files in folders.
// Add files of file type 'spreadsheet' to folder called 'spreadsheets'
// Add files of file type 'document' to folder called 'documents'.
// Leave all other file types alone.
function arrangeDocsToFolders() {
    'use strict';
    var ssFolder = createFolder('spreadsheets'),
        docFolder = createFolder('documents'),
        allFiles = DocsList.getAllFiles(),
        allFilesCount = allFiles.length,
        i,
        rootFolder = DocsList.getRootFolder();
    for (i = 0; i < allFilesCount; i += 1) {
        if (allFiles[i].getFileType().toString() === 'spreadsheet') {
            allFiles[i].addToFolder(ssFolder);
            allFiles[i].removeFromFolder(rootFolder);
        } else if (allFiles[i].getFileType().toString() === 'document') {
            allFiles[i].addToFolder(docFolder);
            allFiles[i].removeFromFolder(rootFolder);
        }
    }
}


The function to execute is "arrangeDocsToFolders()".  However, before executing any code, first beware that the code does the following:
  • It checks to see if two folders called "spreadsheets" and "documents" already exist in the user's Google Drive.  If not, it creates them.
  • It then gets a list of all files in the user's Google Drive, loops over them, and determines their file type. File types "spreadsheet" and "document" are added to the folders "spreadsheets" and "documents", respectively.
  • Execution of the indicated function may require "authorization".
  • Google Drive is different from a standard Windows/Unix directory system in a number of respects.  Firstly, the folders created are organizational rather than physical.  Secondly, files can also belong to multiple folders.  Lastly, there can be more than one folder with the same name in the same Google Drive!
  • The code removes files from the so-called root folder if they have been copied into either the "documents" or "spreadsheet" folders.

Code Notes:

These notes use line numbers so to follow along, view the code in the Script Editor or in GitHub to follow along.
  • The function createFolder() takes a folder name as an argument and uses this argument to get a folder object (line 13). It assigns the folder object to the variable folder.
  • This call is wrapped in a try - catch structure. If the given folder name does not exist, the catch part (lines 14 - 20) checks the error message and if this message indicates that the given folder does not exist, it creates the folder and stores the object returned by  DocsList.createFolder in the variable folder. Otherwise, if it cannot create the folder, it throws an error.
  • The Folder object is returned (line 22).
  • Function "arrangeDocsToFolders()" calls "createFolder()" to either create new folders and return them as Folder objects or else just return the Folder objects for the pre-existing folder names.
  • It then loops over the array of File objects returned by the DocsList.getAllFiles() and tests each element to determine if it is a spreadsheet or a document.
  • The toString() method calls on lines 37 and 39 are required for the string equality test.  It took me a little time to determine why the return values from the File method getFileType() were not behaving as expected.  I assumed they were strings but they are in fact objects.
  • The File method addToFolder() is then called with a Folder object as an argument (lines 38 and 41).  Initially, I tried this code using the folder name string.  The code ran without error but did not do anything.  I would have expected an error but not so!
  • Lines 39 and 42 remove files from the root folder that have been added to the target folders.
Re-running the code has no effect.  Once files have been added to a folder, they are not re-added and  no error is generated if the addToFolder() method is called for a file already in the folder.  However, with folders, as mentioned earlier, multiple folders can have the same name in Google Drive and that's why the code checks if they already exist before it tries to make a new one.

Generating A JavaScript Object Mapping File Names To File IDs

Added Content 2013-02-20: I have added a function to the GitHub code, name getFileNameIdMap() that when executed returns a JavaScript object literal that maps the file names to the file IDs for all files in a user's Google Drive.  File IDs are useful because many GAS methods use the ID to manipulate files.  The IDs are less user-friendly than the file names so this function can be used to generate the desired mapping and calling code can then query the returned object using file names to retrieve the corresponding file IDs.

Here is the code:
// Loop through all the user's Google Drive files and
// return a JavaScript object literal mapping the file names to
// the file IDs.
function getFileNameIdMap() {
    var files = DocsList.getAllFiles(),
        fileNameIdMap = {},
        i,
        fileCount = files.length;
    for (i = 0; i< fileCount; i += 1) {
        fileNameIdMap[files[i].getName()] = files[i].getId();
    }
    return fileNameIdMap;
}


There are many other DocsList, File, and Folder methods that have not been covered here but this topic will be re-visited in future blog entries.

In time, the topics discussed here will be discussed further in this Leanpub book.

Monday, February 4, 2013

Google Spreadsheet Programming - Book On Leanpub

UPDATE 5_AUGUST-2013
 Added Chapter 8 on Google Drive Files, Folders and Permissions

Added content to appendix A on VBA Google Apps Script. New examples and corrected formatting, spelling and added some more notes. This appendix is part of the free sample and can be downloaded from the leanpub book site.

Regarding the MySQL chapter, Xeround that was used for the examples has gone out of business.  Will revise that chapter using Google Cloud SQL this time round.

Preparing Chapter 9 on Email and Calendars, hope to post a draft soon.

Trying different covers!


UPDATE 14-MAY-2013
Added a chapter on Google Apps Script-JDBC-MySQL
Added an appendix with side-by-side examples in Excel VBA and Google Apps Script


UPDATE 18-FEB-2013:
Two new chapters have been added to the book.  The cover Spreadsheet, Sheet and Range objects.  There are also lots of practical examples for these objects, see:

See details here

I am now working on chapter 6 which is on GUIs.  I'll also post some new blog entries on this topic as I prepare the chapter.


04-FEB2013

I have decided to try publishing a book on Google Spreadsheet programming.  There are just three chapters at the moment (~20%) of the total and this is just a rough draft.  The content so far is all free see Leanpub.


I will continue to update this blog and post entries in parallel with adding to the book.  I would greatly appreciate feedback on the contents. Negative feedback is just as useful to me as the positive variety!

The next two chapters are in preparation and will  cover the SpreadsheetApp, Spreadsheet, and Range objects.  I plan to add these in the next week or so.  The style is very much example-driven and all the examples will be subjected to JSLint and published on GitHub.