Monday, July 30, 2012

Connecting To a MySQL Database In "The Cloud" (Part 2)

 For a detailed discussion with worked examples on Google Apps Script - JDBC - MySQl, see chapter 6 in this book on Leanpub


2013-02-08: There is another blog entry on this here!


Update added 2013-01-24: All the code  here is still valid but the MySQL instance that it points to has disappeared (for some reason?).  Suggestion: Set up your own free instance with Xeround or one of the other companies, including Google, who offer cloud MySQL instances.

In my last posting, I showed how to connect to a MySQL database in the cloud from a Google App script using JDBC.  In my closing remarks I noted that this is a vast field with vast potential to match.  I have put together here a few more scripts that will hopefully highlight some of this potential.  The scripts given here use the database described in the previous post.  The main points covered here are:

  • Execution of an SQL SELECT statement and printing of its output
  • Retrieval of an SQL statement from a spreadsheet cell comment
  • Use of bind parameters in a SELECT statement


A Very Brief Overview of Elementary JDBC
Java Database Connectivity (JDBC) is the standard Java API for connecting clients to databases.  Google appear to have done an excellent job in making this API available via Google App scripting.  Database APIs for various languages all follow a fairly similar design philosophy and all try to provide a standard interface for different databases (Oracle, MySQL, etc).  Anyone who has used ADO in Visual Basic or DBI in Perl, for example, should not have too many difficulties with JDBC, at least with basic stuff, the API is very big.  The first and most important step is to establish a connection to the target database, the last post showed how to do this for a MySQL cloud database where a Connection instance called conn was created.  The Connection instance methods "createQuery()" and "prepareStatement()" can then be used to create a Statement or a PreparedStatement instance, respectively.  A ResultSet instance is   returned either by executing the "executeQuery()" method of the Statement instance or by binding a value or series of values to the PreparedStatement instance and then executing the PreparedStatement instance's "executeQuery()" method.  Once the RecordSet instance has been returned, we can loop over it to return the query rows for each call to its "next()" method and/or we can call some of its other methods such as "getMetaData()" to get column information.  Once the RecordSet instance is processed, it is, according to Google's documentation, important to call the "close()" methods defined for instances of Connection, Statement, PreparedStatement, and RecordSet.  This explicit closing is considered good database programming practice for JDBC and similar APIs.  The PreparedStatement permits bind variables, the use of which is important for avoiding "hard parsing" of SQL and for prevention of SQL injection.  Creating custom SQL SELECT, DELETE, or UPDATE statements by string concatenation is a definite no-no when it comes to database programming, ALWAYS use bind parameters for both query efficiency and database security!

Code
A:  Executing a simple SQL SELECT using A read-only Account

// Establish a connection to the MySQL cloud database using a read-only account and assign the Connection instance to the variable "conn".
// Create a Statement instance and assign it to variable "stmt"
// Execute the Statement instance "stmt" by calling its "executeQuery()" method with a simple SQL statement string as its sole argument.
// The returned RecordSet instance "rs" is processed in a "while" loop and the rows are written to the Logger.
// Important clean up by calling "close()"
function printPubmedIds() {
  var conn = Jdbc.getConnection("jdbc:mysql://instance22444.db.xeround.com:12775/pubmed_publication", "read_only", "readonly");
  var stmt = conn.createStatement();
  var rs = stmt.executeQuery("SELECT pubmed_id FROM article");
  
  while(rs.next()){
    Logger.log(rs.getLong(1));
  }
  rs.close();
  stmt.close();
  conn.close();
}

B: Get all columns from table "article" for a given pubmed_id
Using a read-only account
// Executes "getArticle()" by passing a pubmed ID as an argument.
function call_getArticle() {
  var pubmed_id = 18252227;
  getArticle(pubmed_id);
}

// Retrieve all columns from the table "article" for a given pubmed ID.
// Expects the SQL to be execute to be in a comment in cell A1 of a sheet called "sql"
// Note the code that gets the SQL from a cell comment in the call to function "getSqlFromComment()"
// Note also the two consecutive lines that create a PreparedStatement instance and that bind a value to the prepared statement
function getArticle(pubmedID) {
  var sql = getSqlFromComment('sql', 'A1');
  var conn = Jdbc.getConnection("jdbc:mysql://instance22444.db.xeround.com:12775/pubmed_publication", "read_only", "readonly");
  var prepStmt = conn.prepareStatement(sql);
  prepStmt.setInt(1, pubmedID);
  var rs = prepStmt.executeQuery();
  var columnCount = rs.getMetaData().getColumnCount();
  var columnName;
  
  while(rs.next()) {
    for( var i = 1; i <= columnCount; i++ ){
      columnName = rs.getMetaData().getColumnName(i);
      Logger.log(columnName + ':\t' + rs.getString(i));
    }

  }
  rs.close();
  prepStmt.close();
  conn.close();
}


// Given a sheet name and a cell address, return the comment text for that cell in that sheet.
// Check for two arguments and throw an error if two arguments have not been passed.
function getSqlFromComment(sheetName, cellAddress) {
  if(arguments.length !== 2) {
    throw('Function "getCellComment" expects two arguments only ' + arguments.length + ' given!');
  }
  try {
    var ss = SpreadsheetApp.getActiveSpreadsheet();
    var ws = ss.getSheetByName(sheetName);
    var commentedCell = ws.getRange(cellAddress);
    var commentText = commentedCell.getComment();
    return commentText;
  } catch(error) {
    throw('Unable to get comment for sheet: "' + sheetName + '" cell address: "' + cellAddress + '"');
  }
  
}

Code Notes
  • The code in section A should not require any further explanation other than that already given in the code comments.
  • Section B introduces a few additional topics, namely: retrieving SQL from a cell comment and the use of bind variable.
  • To run, insert a sheet called "sql" into the active spreadsheet.  Add a comment to its cell A1 that contains the following SQL:
SELECT
  ar.*
FROM
  article ar
WHERE
  ar.pubmed_id = ?
  • Paste the code into the Script Editor After executing function "call_getArticle()", check the Log (
  • View->Logs menu in the Script Editor) to see the output.
  • The code to get the cell comment was discussed in an earlier post.  The important point to note in the SQL is the "?", it is this character that is replaced by the actual value in the line "prepStmt.setInt(1, pubmedID);" in function "getArticle()" and exemplifies the use of bind variables.
  • Placing the SQL in a spreadsheet cell comment may seem like an odd thing to do but, from past experience with Excel VBA/ADO database programming, I have found it to work well.  Creation of formatted multi-line strings is difficult in both VBA and JavaScript, we have no handy triple quoting as in Python or heredocs and useful string operators such as qq in Perl that can be used to embed pre-formatted SQL statements in scripts.  Spreadsheet cell comments can be used to retain the SQL formatting and to hide the SQL from users.
  • These are demo scripts so important code for exception handling has been omitted. Also, the code would be much more re-usable if it were packaged up in a JavaScript object.  I'll have something more to say about this again in a future post.
  • Lastly, I've left all the connection details for this read-only account in the code snippets given above so anyone should be able to run them  Word of caution: Xeround allow only five connections for these free databases so please ensure that the connections are closed as quickly as possible, otherwise the account will become unusable for everyone.


Friday, July 27, 2012

Connecting To a MySQL Database In "The Cloud"

Detailed worked examples on Google Apps Script-JDBC/MySQL chapter 6 in this book on Leanpub

2013-02-08: There is another blog entry on this here!
 This goes through the process of setting up a MySQL database in the cloud and re-capitulates some of the content here.


Watch this leanpub book for a chapter on Google Spreadsheets/MySQL



Update added 2013-0124: All the code  here is still valid but the MySQL instance that it points to has disappeared (for some reason?).  Suggestion: Set up your own free instance with Xeround or one of the other companies, including Google, who offer cloud MySQL instances.

Cloud computing is a current buzz term beloved by IT managers but when we use products such as gmail we are in using the "cloud" and when we do Google App scripting with JavaScript, we are programming in the "cloud".  Spreadsheets generally are excellent data analysis tools but are not good data storage mechanisms though they are often co-opted into this role.  Plugging a spreadsheet into a relational database management system (RDBMS) gives us the best of both worlds in that we have a secure system built on the sound fundamentals of the relational model coupled with an excellent data analysis and data presentation tool. But we need a mechanism to connect the two; In Excel we can use VBA and ADO to connect to any relational database for which we have a suitable database driver.  JavaScript and JDBC can be used in an analogous manner to connect Google spreadsheets to cloud databases.  This blog post provides an introduction on how to this. 


The steps are as follows:
  • Create an account with a provider of cloud database services, evaluation accounts are generally free.  For this example, I have used xeround to create an account for a free cloud MySQL database.
  • After creating the account, you need to log in to create and populate an actual database.  You can use the mysql command line client or a fancier GUI style application to do this.  My personal favorite is Sequel Pro on the Mac but the standard MySQL Workbench is fine too.  You can also access the phpAdmin application from the Xeround website.
  • The account allocated by Xeround gives its owner full admin permissions to create and drop database objects.  I used this account to create additional accounts with fewer privileges, one such account is the "read_only" used below.  It is always a good policy to use an account with the minimum privileges needed so if you only need to retrieve data, then a "read only" account should be used to avoid inadvertent updates, deletes etc.


The test database contains just four tables that store information on a random selection of 10 published medical papers retrieved from the NCBI PubMed resource (Bethesda, MD, USA).  Briefly the data model is as follows:

  • The pubmed_id is the ID issued by PubMed to uniquely identify a published article and is the primary key for the table "article".  
  • An article can have one or more authors and an author can be listed for one or more papers (many-to-many).  The "author_article" table is used to establish this relationship.
  • An article is associated with one or more Medical Subject Headings, these are collected in the table "article_keyword".

The schema diagram should help to clarify this (database name: pubmed_publication)



Here is the code to connect to my cloud MySQL Database:

// Connect to a cloud MySQL database using a readonly account.
// Execute the MySQL command "SHOW TABLES" to get a listing of the table names in the target database.
// Loop through the results set (rs) returned from the database and write the extracted values to the active spreadsheet.
// Perform cleanup by closing objects (important that this is done)
function connectToMySql() {
  var conn = Jdbc.getConnection("jdbc:mysql://instance22444.db.xeround.com:12775/pubmed_publication", "account_name", "account_password");
  var stmt = conn.createStatement();
  var rs = stmt.executeQuery("SHOW TABLES");
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var cell = ss.getRange('A1');
  var row = 0;
  while(rs.next()) {
    cell.offset(row, 0).setValue(rs.getString(1));
    row++;
  }
  rs.close();
  stmt.close();
  conn.close();
}

The code comments should clarify what this simple function is doing.

  • My Xeround account uses the free or "evaluation" package so, unsurprisingly, there are restrictions.  Firstly, the database can have only five connections open at any one time.  Secondly, its size is limited to 10KB.  Additionally, although I created the tables as type InnoDB because I wanted the referential integrity conferred by foreign key support, they are converted to xeround types which do not support foreign keys which is slightly annoying.
  • To run, email me at mick@javascript-spreadsheet-programming.com and I'll send the account name and account password for the readonly account.  Alternatively, you can create your own cloud MySQL and use that instead, I'm happy to send the source data and SQL to make and populate the database also.  I don't mind sharing this database as it is for testing only, I'm just concerned about exceeding the connection limit thereby making it unusable.
  • The new stuff here is the use of JDBC.  JDBC is, in my opinion, an excellent choice for this task since it is widely used in the Java world and is well documented.




This is a very brief introduction into a vast area that is full of potential.  We can combine knowledge of Google spreadsheets, JavaScript, JDBC, and SQL to create very powerful applications that access data in the cloud.  Using an RDBMS offers us data security and scalability, concurrency, etc so they are a perfect back end for spreadsheets.  I am more familiar with PostgreSQL than MySQL and would prefer to use it but, thus far, Google Apps do not appear to provide a JDBC driver for it.  Having said that, MySQL is a good alternative and probably has a bigger user base so I won't complain.  I'll post more on this in the future.

Meanwhile, check out the Google documentation on JDBC!

Thursday, July 26, 2012

Setting And Getting Cell Comments - Spreadsheet Metadata

Comments can be added to spreadsheet cells both manually and programmatically to document a spreadsheet.  This is especially important for complex template-type spreadsheets where users are expected to add values to certain cells to perform calculations or simulations. Comments are possibly an under-used and under-appreciated aspect of spreadsheet usage.  They are an excellent documentation mechanism in that they can add metadata (data about data) and can be used to provide program input just as easily as actual cell values can.  The functions below demonstrate how to set comments on cells, retrieve comments from cells, and check cells for comments.  All the code below is very straightforward and easily deployed for any spreadsheet. When a comment is added, its presence in a cell is indicated by a small orange triangle in the upper righthand side corner of the commented cell.  The comment itself becomes visible when you move the cursor over the cell.  A nice aspect of cell comments is that they preserve white space so formatting is retained, this is useful when, for example, storing and viewing SQL statements.  It may seem odd, but in a soon to be published posting, I'll show how I use cell comments as a convenient place to store SQL statements. Unlike Excel, there does not appear to be a way in Google spreadsheets to make the comments permanently visible.

Code

// Driver function to add comments to cells, to retrieve the comment text, and check for comments.
// Writes its output to the Logger.
function test_CellCommenting() {
  var targetCellAddress = 'A1';
  var sheetName = 'Sheet1';
  var commentText = 'Adding a comment to a cell at date/time: ' + new Date();
  
  try {
    setCellComment(sheetName, targetCellAddress, commentText);
    Logger.log(getCellComment(sheetName, targetCellAddress));
  } catch(error) {
    Logger.log(error);
  }

  if( isCellCommented(sheetName, targetCellAddress)) {
     Logger.log('Cell is commented');
  }
}
// Given a sheet name and a cell address, this function checks to the cell and returns "true" if it contains a comment, otherwise, it returns "false".
// Calls "getCellComment()" and throws an exception if the arguments count != 2.
// Can be used to check that there are no pre-existing comments before a new comment is added to prevent clobbering of comments.
function isCellCommented(sheetName, cellAddress) {
  if(arguments.length !== 2) {
    throw('Function "isCellCommented" expects two arguments only ' + arguments.length + ' given!');
  }
  var cellComment = getCellComment(sheetName, cellAddress);
  
  if(cellComment.length > 0) {
    return true;
  } else {
    return false;
  }
  
}

// Taking a sheet name, a cell address, and a comment text as input, this function sets the comment for the given cell address on the given sheet name.
// It checks that the numer of given arguments = 3 and throws an exception if this is not true.
// Returns "true" if the comment has been set, otherwise throws an error.
function setCellComment(sheetName, cellAddress, commentText) {
  if(arguments.length !== 3) {
    throw('Function "setCellComment" expects three arguments only ' + arguments.length + ' given!');
  }
  try {
    var ss = SpreadsheetApp.getActiveSpreadsheet();
    var ws = ss.getSheetByName(sheetName);
    var cellToComment = ws.getRange(cellAddress);
    var comment = cellToComment.setComment(commentText);
    return true;
  } catch(error) {
    throw('Unable to set comment for sheet: "' + sheetName + '" cell address: "' + cellAddress + '"');
  }
  
}

// Given a sheet name and a cell address, return the comment text for that cell in that sheet.
// Check for two arguments and throw an error if two arguments have not been passed.
function getCellComment(sheetName, cellAddress) {
  if(arguments.length !== 2) {
    throw('Function "getCellComment" expects two arguments only ' + arguments.length + ' given!');
  }
  try {
    var ss = SpreadsheetApp.getActiveSpreadsheet();
    var ws = ss.getSheetByName(sheetName);
    var commentedCell = ws.getRange(cellAddress);
    var commentText = commentedCell.getComment();
    return commentText;
  } catch(error) {
    throw('Unable to get comment for sheet: "' + sheetName + '" cell address: "' + cellAddress + '"');
  }
  
}

Code Notes
  • The driver function sets a cell comment for a cell and appends a date stamp to it.
  • The exception handling is added to check that the correct numbers of arguments are passed to each of the functions and exceptions are thrown if bad arguments are passed to any of these functions, a non-existent sheet name, for example.
  • I'll show how cell comments can be a useful mechanism for storing and retrieving SQL statements when discussing Google spreadsheets/JavaScript connection to a cloud MySQL instance.  

Final Point:  Cell comments should always be used for adding metadata to spreadsheets and for documenting spreadsheet templates.



Wednesday, July 25, 2012

Inter-Spreadsheet Collaboration

Google spreadsheets are great for collaborative work since spreadsheets can be shared and edited by groups of users.  The following code examples show how JavaScript can be used to open and edit spreadsheets other than the one in which the actual code resides.  Furthermore, there is an example of how the details of those accessing shared spreadsheets can be logged.  In the code below a spreadsheet owned by one user account is opened and edited by another user, i.e. the one executing the JavaScript.  Although the examples given are simplistic, they do suggest how powerful this type of approach can be.  A user can create a spreadsheet with attached JavaScript code, grant appropriate privileges to other Google accounts so that the code can interact with another spreadsheet to manipulate and edit it.

Here are the code examples:


// This is the calling function
function call_openAnotherSpreadsheet() {
  var spreadsheetID = 'google spreadsheet ID';

  try {
    var spreadsheetFile = openAnotherSpreadsheet(spreadsheetID);
  } catch(error) {
    Logger.log('Unable to open given spreadsheet, check error log');
    return;
  }

  try {
    SpreadsheetApp.setActiveSpreadsheet(spreadsheetFile);
    logOpen(spreadsheetFile, 1);
    Logger.log(SpreadsheetApp.getActive().getName());
    Logger.log(SpreadsheetApp.getActive().getNumSheets());
    Logger.log(SpreadsheetApp.getActive().getDataRange().getA1Notation());
    Logger.log(SpreadsheetApp.getActive().getDataRange().getValues());
  } catch(error) {
    Logger.log('ERROR:' + error);
  }
}


function openAnotherSpreadsheet(spreadsheetID) {
  try {
    return SpreadsheetApp.openById(spreadsheetID);
  } catch(error) {
    Logger.log('Error thrown in "openAnotherSpreadsheet" opening given spreadsheet');
    throw(error);
  }
}


function logOpen(spreadsheet, sheetIndex) {
  var targetSheet = spreadsheet.getSheets()[sheetIndex];
  var timestamp = new Date();
  Logger.log(targetSheet.getName());
  var nextRow = targetSheet.getDataRange().getNumRows() + 1;
  var targetCellAddress = 'A' + nextRow;
  targetSheet.getRange(targetCellAddress).setValue(Session.getEffectiveUser().getEmail());
  targetSheet.getRange(targetCellAddress).offset(0,1).setValue(timestamp);
}

To execute the above, paste the code into the script editor, provide an authentic spreadsheet ID as displayed in the spreadsheet URL, this will be the long alphanumeric string between "https://docs.google.com/spreadsheet/ccc?key=" and the trailing "#gid=0".  This ID can be either the spreadsheet containing the code, another spreadsheet, yours or one for which you've been granted edit permissions.  After executing the code, the script should have logged some information into the Logger, to view, select View->Logs from the script editor and also check the target spreadsheet, the one for which you provided the Google ID.  The log should contains the information written by the code in the second try .. catch block in the the function "call_openAnotherSpreadsheet()" while some details on who executed the script (the owner or someone who is sharing the spreadsheet containing the scripts) are recorded by the call  to "logOpen()".


Some Code Notes

  • JavaScript offers structured exception handling via the Java-inspired try-catch-finally mechanism.  I'll return to this in a later post but such actions as opening another spreadsheet are well suited to being placed in a try block where if an exception is thrown by such an action, the error can be logged and then re-thrown.
  • I've used the Logger.log method for the first time to record output.  This provides a convenient and less intrusive method of recording actions and output than the Browser.msgBox mechanism used in earlier posts.
  • Since, I've been away from JavaScript and Google spreadsheet apps for a while, I've re-visited some methods of the SpreadsheetApp and Range classes in the "call_openAnotherSpreadsheet()".  See the calls to "get" methods of the active spreadsheet class in this function and cross-check these with what has been written to the Logger.log, this should clarify what these method calls are doing.
  • The "logOpen()" function does a few interesting things.  Given a spreadsheet instance and a sheet index, it retrieves an instance of the sheet with that instance, determines the number of used rows and increments it by 1(.getDataRange().getNumRows() + 1) .  It then creates an address for column A (var targetCellAddress = 'A' + nextRow;) and uses this address to record the effective user's (whoever is executing the script) email in column A and a date stamp in column B.  Next time the script is called, it re-uses this logic to determine the next available row.
  • Leaving aside the code details, far and away the most important point in the above examples is the facility with which users can use Google spreadsheets and JavaScript to collaborate!  Scripts can be written by anyone and shared with anyone who has a Google account and these scripts can be used to manipulate any spreadsheets for which the script user has permissions.  All very powerful.


Wednesday, July 4, 2012

Sending E-mail From a Google Spreadsheet

 Update August 8th 2013: There is chapter 9 in this book devoted to email and calendar manipulation using Google Apps Script. This chapter covers the GmailApp object in addition to the more limited MailApp object that is used here.

It’s been a while since the last real post so here is one that answers a reader’s query but also gets me back into writing some JavaScript after a liong layoff that’s left me rusty.

I've had a question recently, in italics, as follows:
do you have any fucntion where when it is called like

=if(A2>"5",CallEmailFunction()
,"Don't Call")

so if CallEmailFunction is called it will send email to user with alert notification

Thanks

Firstly, Google provide a tutorial entitled Tutorial: Sending emails from a Spreadsheet
This will be our starting point but I’ll break it down into smaller functions.  
This is not going to work because we are trying to call a function that flouts the stricture that a user-defined function must return a value but cannot have side-effects where the side-effect here is sending an email.  See the earlier posting on user-defined functions.  Excel is similar in this respect in that we cannot call subroutines as user-defined functions.

Anyway, from the editor, let’s write some code
First up, we need a function that when given the arguments, “email_address, “email_subject”, and “email_text” will do as we ask, let’s call it “sendEmail()”.

// We should do some checks in here such as error handling and argument checking
function sendEmail(email_address, email_subject, email_message) {
 MailApp.sendEmail(email_address, email_subject, email_message);
}

// Write a function to call the one defined above
function test_sendEmail() {
 var addr = whoever@gmail.com';
 var subj = 'test email from Google Spreadsheet';
 var msg = 'Let me know if this came through ok!';

 sendEmail(addr, subj, msg);
}

Try running the function “test_sendEmail()”, after asking for authorisation, it should send the email using the arguments given.



Let’s see if we can call the function  “sendEmail()” from a spreadsheet where the arguments are given in cells:

Address Subject   Message Criterion
email1@gmail.com    testing    email message for you     5
email2@gmail.com    testing    email message for you too     3

Attempting to call the spreadsheet from a cell using this formula:
“=if(D2>=5, sendEmail(A2,B2,C2))”
results in a permissions error:
error: You do not have permission to call sendEmail (line 2)

However, we can achieve the desired functionality by running a calling script from the script editor.


Here are some values in a spreadsheet:

The following code starts at cell A2 and based on the criterion supplied in column D, sends an email to the address in column A.


function sendEmail(email_address, email_subject, email_message) {
  MailApp.sendEmail(email_address, email_subject, email_message);
}


function test_sendEmail() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var cell = sheet.setActiveCell('A2');
  var criterion_cutoff = 5;
  var i = 0;
var addr;
var subj;
var msg;
  do {
    addr = cell.offset(i,0).getValue();
    subj = cell.offset(i,1).getValue();
    msg = cell.offset(i,2).getValue();
    criterion = cell.offset(i,3).getValue();
  
    if(criterion > criterion_cutoff) {
      sendEmail(addr,subj,msg);
      Browser.msgBox('Sending email to: ' + addr);
    }
    i++;
  } while( cell.offset(i, 0).getValue().length > 0 )
    
  Browser.msgBox('Done!');
}

To use just create a spreadsheet like the one shown above, open its script editor and paste in the code for the two functions given above. Save the code and execute the "test_sendEmail()" function.

Some Notes On The Code
  • Declare variables and set a starting cell (var cell = line)
  • Set a counter for the do loop
  • Use the "offset" method of the Range object to retrieve cell values
  • Check the criterion variable and based on its value, send emails by calling the "sendEmail()" function with arguments taken from cell values
  • Increment i so that the code goes to the next row
  • Continue until the "offset" returns an empty cell
  • After not having written any JavaScript for a while, I'd forgotten that "length" is a property and not a method so it took me some time and head-scratching to determine why "length()" calls in the "while" condition were not working!
The above code could be executed from a form and I'll try to cover this in a future post. Meanwhile, check out what looks like some very useful material at Henrique Abreu's formmailer website

Planning a post on regular expressions very soon.