Friday, January 21, 2011

Using JavaScript Objects In Google Spreadsheets

The power of JavaScript objects is belied by the very simple syntax required for their declaration:
var my_object = {};
That is all that is required to create a JavaScript object using the object literal syntax.  There is also a Java-like syntax for creating objects that uses the new operator but, following Douglas Crockford’s advice in his book “JavaScript: The Good Parts” this syntax is not used here.  Apart from making JavaScript look a little more like a conventional object oriented language, it appears to offer no advantages over the object literal syntax.
Objects are simply collections of colon-separated key/value pairs with key-value entries separated from each other by commas.  Objects can be arbitrarily complex storing values ranging from simple scalars to other objects.  Objects can also store functions where functions attached to objects in this way are known as methods.  This storage of data together with the functions that operate on that data is reminiscent of classes in more conventional object oriented languages.
To demonstrate some of the power of objects, consider the following frequently encountered scenario:
  • How can one count the occurrences of distinct cell value in an input range?  
  • A secondary consideration might be how to filter out just those cell values that occur more than a certain number of times in an an input range.

In Excel, this can be done easily with pivot tables while in a database one can employ the COUNT() function in conjunction with a GROUP BY clause.  Scripting languages such as Perl and Python have built-in objects known as hashes and dictionaries, respectively, to perform this sort of task.   Excel VBA can be extended to include a dictionary object by referencing the Microsoft Scripting Runtime library through the Tools|References menu in the VBA editor.
Regardless of language/environment, this is a typical data reduction task and JavaScript objects can be put to work to do it.

Counting Occurrences Of Values
Previously JavaScript arrays were used to produce a unique list from an input range.  This technique relied on the JavaScript indexOf() method to check an array for value membership.  But what if one wants a list of elements and their counts?  Here is one way to achieve this using objects.

To run the following code as is:
  1. Create a Google spreadsheet
  2. Re-name or create a sheet and call it “input”
  3. Add some values to range “A1:A10” of the sheet “input”.  Make sure there are some repeated values to see how the JavaScript code counts them.
  4. Paste the code into the JavaScript editor and run function “test_getCountValues()”


The code adds a new sheet called “output” and writes the output into columns A and B of the is sheet starting in the first cell “A1”.  The code comments should clarify what is going on

// Test for function "getCountValues()".
// Need to define an input and an output range.
// Create an object with the cell value-occurrence counts by calling "getCountValues()".
// Use the enumeration "for .. in loop" to extract each object key and its associated value.
// Write the key and count values to output.
function test_getCountValues() {
   // Define an input range with values of some kind in it.
   var test_range = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("input").getRange("A1:A10");
   // Pass the input range as an argument to function "getCountValues()" and
  // capture its return object.
   var count_values = getCountValues(test_range);
   // Add a new sheet for output and store a reference to the sheet object object.
   var new_sheet = SpreadsheetApp.getActiveSpreadsheet().insertSheet("output");
   // Create a range object of one cell in the new sheet for output.
   var output_range = new_sheet.getRange("A1");
   // Set a row counter for output.
   var row_counter = 0;
   
   // Using the for ... in loop, extract the key/value pairs from the object and
  // write them to output.
   for  ( var key in count_values ) {
       var value_count = count_values[key];
       // Write the unique cell values with their counts.
       output_range.offset(row_counter, 0).setValue(key);
       output_range.offset(row_counter, 1).setValue(value_count);
       // Increment the row counter for the next iteration of the for .. in loop.
       row_counter += 1;
   }

}
 
// Take a range object as a single argument for the parameter named "range_input"
// Return an object where the keys are the unique cell values and their associated
// values are counts of the number of times they occur in the input range.
function getCountValues( range_input ) {
   // Create an object to store the unique cell values and their occurrence count.
   var values_counts = {};
   
   // Process the input range in a nested for loop using the outer and inner counter
   // variables (respectively, i and j) as arguments to the range "getCell()" method.
   for( var i = 1, row_count = range_input.getNumRows(); i <= row_count; i++ ) {
       for( var j = 1, col_count = range_input.getNumColumns(); j <= col_count; j++ ) {
           // Extract the value for a cell with row and column coordinates i and j respectively.
           var cell_value = range_input.getCell(i,j).getValue();
         // Check if this cell value is already in the object:
           if( values_counts[cell_value] ) {
               // If it is increment the value for which it is the key.
               values_counts[cell_value] += 1;
           } else {
               // If it is not in the object, add it, and set they value it is key for to 1.
               values_counts[cell_value] = 1;
           }
       }   
   }
   
   return values_counts;

}

​Code Notes
  • The code above demonstrates a simple but useful application of JavaScript objects to spreadsheet programming.
  • It introduces the for...in loop to enumerate the keys.  The standard for loop is used to iterate over arrays but the for...in loop is required for objects.
  • It is important to realise that JavaScript objects are not dictionaries or hashes in the Perl/Python sense but they do support some of the key characteristics of these data structures.  The important one for this application is the ability to test for membership.
  • The code above only scratches the surface of what JavaScript objects are capable of.  They are extremely flexible and future posts will try to convey their power and applicability to spreadsheet programming.

Monday, January 17, 2011

More On JavaScript Arrays

The previous post concentrated on the range methods getValues() and setValues() that can be used to read range values into JavaScript arrays and return JavaScript array values to ranges, respectively.  But JavaScript arrays are extremely useful objects in their own right and come with a large number of methods.  For spreadsheet programming, Excel VBA arrays appear limited and inflexible in comparison.  When studying JavaScript arrays, it is advisable to use up-to-date sources that cover at least version 1.6 of the language since new and useful methods have been added to them.  Older sources often give example code showing how to test an array for element membership by looping through the array, for example.  But JavaScript arrays since version 1.6 have supported two methods that can perform this useful task, namely, indexOf() and lastIndexOf().  These methods behave similarly to string objects of the same name where they are used to locate sub-strings.   Two other useful methods are map() and filter() that can be used return new lists where map() applies a function to each element and filter() selects elements based on user-specified criteria.  All four of these JavaScript 1.6 array methods indexOf(), lastIndexOf(), map(), and filter(), are available in Google App Scripting.  A few examples showing how these methods can be used to good effect are given below.

Ensure Element Uniqueness In A JavaScript Array
// Demonstration of array methods.
// Use Browser.inputBox() in a do loop to repeatedly prompt for a value to add to an array.
// Only add the value if it does not already exist in the array.
// If it does already exist, display a message to indicate this.
// Exit the loop when the 'cancel' button is pressed.
// Display the list of unique elements.
function noDuplicatesArray() {
   // Declare an empty array to hosld the unique list
   var unique_elements = [];
   var element;
 
   // Set up a loop to repeatly prompt for values.
   do {
       // Capture the text entered to a variable.
       element = Browser.inputBox("No Duplicates List",
                                  "Enter an element or cancel",
                                  Browser.Buttons.OK_CANCEL);
       // Exit loop when 'cancel' button is pressed.
       if ( element === 'cancel' ) { break; }
       // Check if the entered element is already in the array using indexOf().
       if( unique_elements.indexOf(element) > -1  ){
           // Alert the user if the element has already been added to the list.
           Browser.msgBox("Element " + element + " already added!");
       } else {
           // If the element is not already in the list add it to the end of the array
           // using the push() method
           unique_elements.push(element);
       }
     // Loop while 'element' is true, entering an empty string will cause the loop to terminate.
   } while ( element );
   
   // Display the elements as a string by using the array join
   Browser.msgBox(unique_elements.join(', '));
}
Code Notes
  • When invoked, the function “noDuplicatesArray()” repeatedly prompts for user entry and only exits when either the input dialog ‘cancel’ button is pressed or an empty string is entered (JavaScript treats empty strings as false!).
  • If an attempt is made to add an element already in the array, it is rejected and the MsgBox dialog is displayed to that effect.
  • If the newly entered element is not already in the array, it is added using the array push() method which adds the new element to the end of the array.
  • The key method is indexOf(), it returns -1 if its argument is not already in the array, lastIndexOf() could also have been used and would yield identical results.


Generating A Sub-Set Of An Input Array Using The filter() Method
// Apply a filter to the test array that uses an in-line function
// that is applied to each element returning true for elements passing
// the test, false otherwise.  Only elements for which the test is
// true are returned to the new array.
function test_filter() {
   // Create and populate an array object with test data.
   var array = ['rat', 'cat', 'cow', 'horse', 'dog', 'mouse', 'pig'];
   // Return only elements with less than four characters to the new array.
   var short_names = array.filter( function( element ) {
                                       return element.length < 4 ? true : false; } );
   Browser.msgBox(short_names);
}
Code Notes
  • Using one array as input to generate a new array, based on some programmer-defined rule, is a common task.  In a more realistic setting, the input array would be generated at run-time rather than pre-populated as it is here.  
  • One could use a for loop to process the input array and implement the filtering in the body of the loop.  However, the filter() method is specifically designed to do this.
  • The filter() method takes either a function object or an in-line function as used above, and returns a Boolean for each element.  Elements generating true are included in the return array.
  • The code above uses the JavaScript ternary operator to determine the Boolean return value.  An if else statement could also have been used.


Apply A Function To Each Element Of An Input Array Using The Array map() Method
// Apply an in-line function to each array element and add the return value to the return array.
function test_map() {
   // Create and populate an array object with test data.
   var array = ['rat', 'cat', 'cow', 'horse', 'dog', 'mouse', 'pig'];
   var upcase_names = array.map( function( element ) {
                                     return element.toUpperCase(); } );
   Browser.msgBox(upcase_names);       
}
Code Notes
  • The map() method, like the filter() method shown above, applies a function to each element of the input array.  However, unlike filter(), it returns an array of the same length as the input array.  
  • All or some of the elements of the input array may be altered in the output array.
  • The above example applies the string toUpperCase() method to each element.
  • map(), like filter(), is more useful when the input arrays are populated at run-time.


A More Practical Example - Filtering Duplicates From Spreadsheet Input
  • As stated above these array methods are more useful when the arrays that they operate on are generated at run-time.  
  • One practical application is to use them to filter out duplicates from an input Google spreadsheet range.
  • The writeUniqueList() below performs this filtering by taking an input range to filter and an output range to where it writes the unique list.
  • It is tested in the function test_writeUniqueList() where the input and output ranges are defined.
  • To use, add some data to a sheet, the test code below uses range "A1:A9", paste the code into the Script Editor and run the test function:


// Define range arguments for writeUniqueList() and call it with these arguments.
function test_writeUniqueList() {
   // Define an input range
   var input_range = SpreadsheetApp.getActiveSheet().getRange("A1:A9");
   // Define an output range as one column to the right of the input range
   // Warning, it will over-write
   var output_range =  input_range.getCell(1, 1).offset(0, 1);
   // Call the function with the correct arguments
   writeUniqueList(input_range, output_range);
}

// Write a unique list of values in an input range of a single column to an output range.
// Disallow input ranges with >1 column.
function writeUniqueList( range_input, copy_to_cell ) {
   // Check that the input range is single-column, otherwise throw an exception.
   if ( range_input.getNumColumns() > 1 ) {
       throw("Input range can only contain a single column");
   }
   // The range method getValues() always returns an array of arrays.
   // For single column input, the inner arrays are all single element.
   // Apply the map() method to turn the inner arrays into scalar values
   var values_input = range_input.getValues().map( function( inner_array )
                                                   { return inner_array.join(''); } );
   // Declare an array to which values are added
   var unique_values = [];
   // Declare a variable for the range row offset for output.
   var offset_row = 0;
   // Loop over the array of values from the single-column range input.
   for( var i = 0, values_input_count = values_input.length; i < values_input_count; i++ ) {
       // Perform array membership test.
       if ( unique_values.indexOf(values_input[i]) === -1 ) {
           // Add the element only if it is not already present.
           unique_values.push(values_input[i]);
           // Write the value to output if it has not been added to the array in a previous iteration.
           copy_to_cell.offset(offset_row,0).setValue(values_input[i]);
           // Increment the row offset counter for the next iteration.
           offset_row++;
       }
   }
   
}
Code Notes
  • The above code uses array methods to add some useful functionality, namely, produce a list of unique values from an input range.  There does not appear to be a way to perform this task using the Google spreadsheets interface.
  • To simplify matters, it throws an exception if the input range is not single-column.
  • Once again, the key method that is used to determine uniqueness is  indexOf().
  • Possibly the trickiest part of the code is the application of the map() method to the array of arrays returned by the range getValues() method.  It is important to remember that the array that getValues() returns is not a simple array of scalars, to turn it into an array of scalars, the map() method applies the array join() method with an empty string argument to each input element which turns a single element array into a scalar value.
  • When defining the output range in the calling function, the range method getCell() was used.  This is a one-based method, like the Cells property of the range object in Excel VBA.  getCell(1,1) returns a range object representing the top left cell of the given range object, just like Cells(1,1) in Excel VBA.