Excel…. Version control…. Git the hell out of here!

Version tracking Excel VBA code with git…. I’m not talking about uploading your spreadsheets to GitHub and calling it a day…. boring, we can do a bit better than that obviously!

I’m talking about extracting your actual VBA code from your spreadsheets and tracking changes to that code using git.

Well, that’s a pain you might say, extracting/exporting the code modules from Excel manually each time you make a change.

Well, I couldn’t agree more, it would be a major pain if you did it manually. But have you learned nothing from your time here regarding using code to automate the boring bits…. Read on.

Git and Excel using hooks

This is a subject I thought I would have done a post on well before now as I’ve been using this with all my Excel spreadsheets for quite some time to track changes to my VBA code. I guess better late than never as they say.

So how would one achieve this automation, the key is pre-commit hooks. Git hooks are a way of triggering some code to run when you commit a change using git or do other git related actions.

Every local git repository has a hidden .git directory, in this directory all the settings related to the repository are stored: –

If you delve into this directory you’ll find a sub-directory called hooks. If you open this up, you’ll find all the various hooks you can use. What we are interested in is using in this instance is the pre-commit hook.

When you make a change to a spreadsheet and commit these changes to a repository, this pre-commit hook runs code first that extracts the VBA code modules and adds these to the commit before then actually doing the commit proper like.

To get the VBA script working you’ll need to enable programmatic access to the VBA project within Excel. You can do this by going -> File -> Options -> Trust Center -> Trust Center Settings -> Macro Settings -> activate checkmark on Trust access to the VBA project object model

The code…

So how does this work, you simply replace the default example pre-commit.sample file with the following pre-commit file: –

#!/bin/sh

# PRE-COMMIT file

# last updated 16/2/2019

# Python scripts to extract VBA modules and XML files from Excel files and remove any orphaned directories (where an excel file has been deleted from a repository)
python .git/hooks/pre-commit-VBA.py
python .git/hooks/pre-commit-XML.py

# Update the commit to include any new/deleted files resulting from the above scripts.
# This will add any new *.VBA & *.XML directories to the commit that were created by the above scripts,
# and will remove any orphaned *.VBA & *.XML directories to the commit that were removed by the above scripts.
# These commands will only add tracked files (i.e. those not excluded by a .gitignore file)

# this updates the commit for any files removed by the scripts within the tracked files/directories
git add --update

# this updates the commit for any files added by the scripts within the tracked files/directories
git add -- ./*.VBA || :
git add -- ./*.XML || :      

This then runs a python file pre-commit-VBA.py containing the following code which opens up (and closes once finished) any excel files in the root of the repository (make sure you control this by using a .gitignore file, see example which excludes all directories but the root directory within the GitHub repository), exports the code into subdirectories named with the filename and .VBA suffix.

'''
pre-commit-VBA.py

Last updated 30/07/2020

Script extracts VBA modules, forms and class modules into a new 'FILENAME.VBA' subdirectory within the repository root folder (by default)

With the standard .gitignore entries, only Excel files located within the root directory of the repository will be processed & added to a commit
'''

import os
import shutil
import win32com.client
import time
import traceback


def init_Xl(file_path):
    '''function to open Excel and open Workbook at file_path

    :param file_path: full path of Workbook to open/process
    :type file_path: str
    :return: Xl_app ('Microsoft Excel'), Excel Workbook (<COMObject Open>)
    :rtype: str
    '''
    # open Excel as hidden instance in the background
    Xl_app = win32com.client.DispatchEx('Excel.Application')
    # required to disable all macros in all files opened programmatically without showing any security alerts. Equivalent of Application.AutomationSecurity = msoAutomationSecurityForceDisable
    Xl_app.AutomationSecurity = 3
    # required to suppress the overwriting of existing file
    Xl_app.DisplayAlerts = False
    # required to suppress any application events from running on opening excel files
    Xl_app.EnableEvents = False
    # open Workbook
    Xl_wb = Xl_app.Workbooks.Open(Filename=file_path, Editable=True)
    print('Starting Excel instance...')
    return Xl_app, Xl_wb


def close_Xl(Xl_app, Xl_wb):  # , file_path, Xl_file_format):
    '''function to save & close Workbook instance

    :param Xl_app: 'Microsoft Excel'
    :type Xl_app: str
    :param Xl_wb: Microsoft Excel Workbook COMObject
    :type Xl_wb: str
    '''
    print('Closing Excel instance...')
    Xl_app.Workbooks(Xl_wb.Name).Close(SaveChanges=False)


def module_file_ext(Xl_module_type):
    '''function to return the standard extensions based on module type numeric reference

    :param Xl_module_type: Module type codes
        * Type = 1   = Standard Module    -- (*.bas)
        * Type = 2   = Class Module       -- (*.cls)
        * Type = 3   = Userform Module    -- (*.frm)
        * Type = 100 = Sheet Class Module -- (*.cls)
    :type Xl_module_type: int
    :return: standard file extension based on module type
    :rtype: str
    '''
    module_type_dict = {1: '.bas', 2: '.cls', 3: '.frm', 100: '.cls'}
    return module_type_dict[Xl_module_type]


# function to extract the VBA modules from an Excel archive
def extract_VBA_files(workbook_name):

    # Find index of standard '- Rev X.xxx' tag in workbook filename if it exists
    index_rev = workbook_name.find(Rev_tag)

    # If revision tag existed in the filename, remove it based on index position, else remove only the file extension
    if index_rev != -1:
        filename = workbook_name[0:index_rev]
    else:
        filename = os.path.splitext(workbook_name)[0]

    # Setup directory name based on filename
    vba_path = directory + filename + VBA_suffix

    # Make new directory (to replace existing where it previously existed and was removed)
    os.mkdir(vba_path)

    # print to console for information/debugging purposes
    max_len = max(len('Directory name -- ' + vba_path), len('Workbook name  -- ' + workbook_name))
    print()
    print('-' * max_len)
    print('Workbook name  -- ' + workbook_name)
    print('Directory name -- ' + vba_path)
    print('-' * max_len)

    # Open Excel instance and open workbook
    Xl_app, Xl_wb = init_Xl(directory + workbook_name)

    # Iterate through each VBComponent (Sheet Class Modules, Class Modules, Standard Modules, Userforms)
    try:
        for Xl_module in Xl_wb.VBProject.VBComponents:
            # get name of module
            Xl_module_name = Xl_module.Name
            # get module type
            Xl_module_type = Xl_module.Type

            module_name = f'{Xl_module_name}{module_file_ext(Xl_module_type)}'  # filename e.g. 'module.bas'
            module_path = f'{vba_path}\\{module_name}'

            # for standard modules, class modules and userforms
            if Xl_module_type in [1, 2, 3]:
                # export module
                Xl_wb.VBProject.VBComponents(Xl_module_name).Export(module_path)
                print(f'Exporting -- {module_name}')

            # for sheet class modules
            elif Xl_module_type in [100]:
                # export module
                Xl_wb.VBProject.VBComponents(Xl_module_name).Export(module_path)
                print(f'Exporting -- {module_name}')

    except Exception as ex:
        print(ex)

    finally:
        # close Excel instance
        close_Xl(Xl_app, Xl_wb)


try:
    # list excel extensions that will be processed and have VBA modules extracted
    excel_file_extensions = ('xlsb', 'xls', 'xlsx', 'xlsm', 'xla', 'xlt', 'xlam', 'xltm')

    # process Excel files in root directory (not recursive to subdirectories)
    directory = os.getcwd() + '//'

    # String to append to end of subdirectory
    VBA_suffix = '.VBA'

    # String to remove from end of filename
    Rev_tag = ' - Rev '

    # remove all previous '.VBA' directories including contents
    for directories in os.listdir(directory):
        if directories.endswith(VBA_suffix):
            for x in range(0, 5):  # try 5 times
                try:
                    shutil.rmtree(directories)
                    err = False
                except Exception:
                    err = True
                    with open("precommit_exceptions_log.log", "a") as logfile:
                        traceback.print_exc(file=logfile)
                    pass

                if err:
                    time.sleep(2)  # wait for 2 seconds before trying to remove directory again
                else:
                    break

    # loop through files in given directory and process those that are Excel files (excluding temporary Excel files ~$*)
    for filenames in os.listdir(directory):
        if filenames.endswith(excel_file_extensions):
            # skip temporary excel files ~$*.*, otherwise process Excel file and extract VBA modules
            if not filenames.startswith('~$'):
                # extract VBA modules
                extract_VBA_files(filenames)

    # print trailing line to separate output from multiple files
    print()

    # loop through directories & remove '.VBA' directory if nothing was extracted and directory is empty
    for directories in os.listdir(directory):
        if directories.endswith(VBA_suffix):
            if not os.listdir(directories):
                for x in range(0, 5):  # try 5 times
                    try:
                        os.rmdir(directories)
                        err = False
                    except Exception:
                        err = True
                        with open("precommit_exceptions_log.log", "a") as logfile:
                            traceback.print_exc(file=logfile)
                        pass

                    if err:
                        time.sleep(2)  # wait for 2 seconds before trying to remove directory again
                    else:
                        break

except Exception:
    with open("precommit_exceptions_log.log", "a") as logfile:
        traceback.print_exc(file=logfile)
    raise
    

By limiting it to just Excel files in the root of the repository you can ignore any other files in subdirectories, so you only commit what you want vs everything. But you’re free to do whatever you like. You can also run the above manually and it will provide some feedback on progress as well within the command line output.

As a bonus the following python code exports any customui.xml files you might have in your Excel spreadsheet (to interact with the Excel ribbon) using the following code: –

'''
pre-commit-XML.py

Last updated 23/02/2019

Script extracts customUI.xml/customUI14.xml files into a new 'FILENAME.XML' subdirectory within the repository root folder (by default)

With the standard .gitignore entries, only Excel files located within the root directory of the repository will be processed & added to a commit
'''

import os
import shutil

from zipfile import ZipFile

# list excel extensions that will be processed and have VBA modules extracted
excel_file_extensions = ('xlsb', 'xlsx', 'xlsm', 'xlam', 'xltm')

# process Excel files in this directory (not recursive to subdirectories)
directory = '.'

# String to append to end of subdirectory based on filename
XML_suffix = '.XML'

# String to remove everything after in Excel filename
Rev_tag = ' - Rev '


# function to extract the XML files from an Excel archive
def extract_XML_files(archive_name, full_item_name, extract_folder):

    with ZipFile(archive_name) as zf:
        file_data = zf.read(full_item_name)
    with open(os.path.join(extract_folder, os.path.basename(full_item_name)), "wb") as file_out:
        file_out.write(file_data)


# remove all previous '.XML' directories including contents
for directories in os.listdir(directory):
    if directories.endswith(XML_suffix):
        shutil.rmtree(directories)

# loop through files in given directory and process those that are Excel files (excluding temporary Excel files ~$*)
for filename in os.listdir(directory):
    if filename.endswith(excel_file_extensions):

        # skip temporary excel files ~$*.*
        if not filename.startswith('~$'):

            # Setup variable for wookbook name
            workbook_name = filename

            # Find index of standard '- Rev X.xxx' tag in workbook filename if it exists
            index_rev = filename.find(Rev_tag)

            # If revision tag existed in the filename, remove it based on index position, else remove only the file extension
            if index_rev != -1:
                filename = filename[0:index_rev]
            else:
                filename = os.path.splitext(filename)[0]

                # Setup directory name based on filename
            xml_path = filename + XML_suffix

            # Make new directory (to replace existing where it previously existed)
            os.mkdir(xml_path)

            # print to console for information/debugging purposes
            max_len = max(len('Directory name -- ' + xml_path), len('Workbook name  -- ' + workbook_name))
            print()
            print('-' * max_len)
            print('Workbook name  -- ' + workbook_name)
            print('Directory name -- ' + xml_path)
            print('-' * max_len)

            # Extract customUI.xml & customUI14.xml files from temporary zip file if they exist
            try:
                extract_XML_files(workbook_name, 'customUI/customUI.xml', xml_path)
                print('Extracted -- customUI.xml from ' + workbook_name)
            except KeyError:
                print('customUI.xml does not exists in ' + workbook_name)
            try:
                extract_XML_files(workbook_name, 'customUI/customUI14.xml', xml_path)
                print('Extracted -- customUI14.xml from ' + workbook_name)
            except KeyError:
                print('customUI14.xml does not exists in ' + workbook_name)

# print trailing line to separate output
print()

# loop through directories & remove '.XML' directory if nothing was extracted and it is therefore empty
for directories in os.listdir(directory):
    if directories.endswith(XML_suffix):
        if not os.listdir(directories):
            os.rmdir(directories)

All this opening and closing of Excel instances is hidden from the user to prevent any inadvertent user interaction, this is done by line 27 in the pre-commit-VBA.py file. To be fair it took a while to work that one out and avoid opening visible Excel instances popping up all over the place…

Finally, we remove any files from the commit that no longer exist and we add all the newly extracted code modules, userforms, etc to the git commit.

Setup on your repositories…

So how do we get all this working on your Excel repository, simply copy the pre-commit, pre-commit-VBA.py and pre-commit-XML.py files from this GitHub repository into your repository’s hooks directory. Thats all apart from setting up an appropriate set of rules in your .gitignore file and you’re good to go.

Then just commit and push like normal. Except this time, you’ll end up with a few additional directories with any code modules in addition to your Excel file!

Check out some additional commentary in the readme of the GitHub repository regarding my approach to managing Excel files.

I like to add ‘ – Rev 1.xxx‘ at the end of the Excel filename to identify the revision immediately. The code detects a common suffix and removes it, so the code modules are always extracted into subdirectories with the filename of the file without this suffix. You can of course amend this to your own liking.

Extraction directories

Disclaimer

Sometimes Windows (being Windows…) throws up a permissions error that prevents the extraction of the code modules. This seems random to me, and only occurs once in a blue moon. In an attempt to get around this, if it fails, the code attempts the process again a max of 5 times. This is usually enough to allow Windows to release whatever hold it had over your file(s) and allow the code to export the VBA modules.

Sometimes Windows is a dick, and the extraction still fails after multiple attempts. If this rare occasion occurs to you, you can remove the latest commit before you push it to the remote repository. Then redo your commit, and 9999 times out of 9999 times it will work.

If this occurs a precommit_exceptions_log.log file will be added to the commit stating the error. So, keep an eye out for it, if you push to your remote repository the commit may be missing some code modules and you’ll remove the remote version of this module (stuffing up your versioning a little).

You can delete the last local commit (don’t push it) by opening a command window in your local repository and entering the following command: –

git reset --soft HEAD~1

Then do the commit again and you’re good to go.

Conclusion

$Profit$ … what other conclusion could there be here! Harnessing the power of Git with Excel!

Any questions either post a comment here or raise an issue over on GitHub

Leave a Reply