William Matsuoka
  • Introduction
  • Topics
  • Stata
  • Teaching
    • ECON 641L
    • ECON 640L >
      • Econ-Data
  • Blog
  • About

W=M/Stata

Welcome

On Ad-Hoc Excel Equations

10/9/2015

2 Comments

 
Coercion.  It’s the topic of the day, and a short topic at that.  Coercion is the leading driver for Excel usage for most Stata users.  Workbook linkage is usually a pain – the practice of linking one cell from a workbook to a different cell in another workbook – and can turn a relatively simple task into a nightmare usually ending the job with a headache and a stiff drink.
​
However, with a few simple Mata commands, the crisscross of equal signs, dollar signs, and formulas can turn a repetitive task into a few simple copy-and-paste commands. There is also less room for error.  For fun, we'll do this entire thing in Mata.  First, we need to decide how many observations/variables we need.  Let’s look at our current workbook and target workbook.
Picture
BookA: Destination Cells
Picture
Source Cells
​So we need 5 columns and 7 rows, with columns from D to H and rows from 3 to 9 in our BookB set.  Write it:
mata:
st_addobs(7)
st_addvar("str255", "v" :+ strofreal(1..5)) 
​
​Starting with a blank dataset, add seven observations.  From there, we create 5 variables (v1 v2 v3 v4 v5) that are all of type str255.  This is a placeholder value for long equations.  The second part of this command “v” :+ strofreal(1..5) is my favorite part about Mata.  The colon operator (:+) is an element-wise operator that is super helpful.  No more Matrix conformability errors (well maybe a few).  It creates the matrix:
("v1", "v2", "v3", "v4", "v5")

​without much work and allows for any number of variables!  Sidenote – changing (1..5) to (1..500000) took about 3 seconds to create 500,000 v#’s.  Next, let’s set up our workbook name:
bookname = "'C:\Users\wmmatsuo\Desktop\[BookB.xlsx]Linked'!"

Specify the location of the workbook with the file name in square brackets and enclose the sheet name in apostrophes.  End that expression with an exclamation point!  Since we know we want D through H, and cells 3-9, we can write the following:
cols = numtobase26(4..8)
rows = strofreal(3::9)

numtobase26() turns 4 through 8 into D through H.  For more information see my previous post on its inverse.  The rows variable is a column vector containing values 3 through 9.  Now it’s time to build our expression:
expr = "=" :+ bookname :+ "$" :+ J(7, 1, cols) :+ "$" :+ rows

We put an equals sign in front of everything so that Excel will know we’re calling an equation.  Follow that with the workbook name.  If you didn’t know, dollar signs in excel will lock your formulas so they don’t move when you copy formulas from one cell to another.  We’ll put those in as well.

The next part, J(7, 1, cols), takes our column letters and essentially repeats it seven rows down.  Thanks to the colon operator, we just have to add another dollar sign to all elements of the matrix and our row numbers.  Since the new cols matrix contains seven rows, and our row vector contains seven rows, it knows to repeat the row values for all columns.
​​
Let’s just put that matrix back into Stata and compress our variables.
st_sstore(., ("v" :+ strofreal(1..5)), expr)
stata("compress")
end
Picture

And voilà, we have a set of equations ready to copy into Excel.  One quick tip: set your copy preferences so that when you copy the formulas, you won’t also copy variable names by going to edit->preferences->data editor-> uncheck “Include variable names on copy to clipboard”​.  Now just copy the equations directly from your Stata browse window into Excel, and enjoy those sweet results!
Picture

In other news, this post built upon the putexcel linking discoveries found over at belenchavez.com - who recently had her Fitbit stolen!  Follow her journey to get it back on her website or on Twitter - hopefully she'll be able to coerce the thief into returning the device.

The heart rate of my #fitbit thief pic.twitter.com/jM3V3g88V7

— Belen (@_belenchavez) October 10, 2015
2 Comments

Putexcel Part II: numofbase26()

9/18/2015

3 Comments

 
Did you know that Mata has a hidden command called numtobase26?  Given its usefulness with putexcel, I can’t possibly see why it’s hidden!  Perhaps Stata doesn’t want us using their highly-tuned statistical analysis software as a glorified spreadsheet beautifier.  Nevertheless, you may be forced to automate a god-awful excel report in order to free up time to utilize Stata to the fullest!

This is where we start to depart from our beloved putexcel, and shift slightly towards xl().  Both commands tend to serve the same purpose, but much like Stata vs. Mata, putexcel is meant for the layman, and Mata is meant to make things fast while being slightly less readable.  Gross oversimplification aside, putexcel favors the traditional spreadsheet alphabetical columns names combined with row numbers A1, A2, A3 whereas xl() generally takes the r1, c1 form.

Say you have a dataset with 50 variables: v1, v2, v3, …, v50 and you wish to rename the variables so that they match the rows of excel.  We could try and loop through the alphabet c(ALPHA), but once you get to Z, things start to get tricky.  For me, a much simpler solution is to use numtobase26.  For those unfamiliar with Mata, you can make a vector of sequential numbers with ease:
mata: (1..50)
We have a row vector with 50 elements that we can now convert to our excel column names using:
mata: numtobase26((1..50)) 
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z AA AB AC AD AE AF AG AH AI AJ AK AL AM AN AO AP AQ AR AS AT AU AV AW AX

We just need to store it as a local so we can use it easily in Stata.  (Also concatenate the elements)
mata: st_local("alphavars", invtokens(numtobase26(1..50)))
ren * (`alphavars'), dry

You can imagine how this can greatly simplify creating putexcel functions that require you to reference columns by their excel name using a simple foreach var of varlist * combined with the levelsof technique described in Part I.

Codebook

What happens if you want to go the other way?  Say you have the column BC12 but you wish to convert this to 55, 12 so you can use it in one if xl()’s many functions.  Unfortunately, I could not find an easy solution, so I made a rough fix that should be similar to how putexcel handles its cell expressions.  Feel free to use the numofbase26() Mata function if you ever find yourself in this predicament.
real matrix numofbase26(string matrix base)
{
    real matrix output, pwr, b
    real scalar i, j, k, l
    
    base = strupper(base)
    output = J(rows(base), cols(base), .)

    for (i=1; i<=rows(base); i++) {
        for (j=1; j<=cols(base); j++) {
            if (strlen(base[i,j]) == 1) output[i,j] = ascii(base[i,j]) - 64
            else {
                l = strlen(base[i,j])
                b = pwr = J(1, l, .)
                for (k=1; k<=l; k++) {
                    b[1,k] = ascii(substr(base[i,j], k, 1)) - 64
                    pwr[1, k] = l - k
                }
                output[i,j] = rowsum(b :* (26:^pwr))
            }
        }
    }
    return(output)
}

Happy Automating!  I promise the next post will contain less technical stuff and more pictures.

3 Comments

    Author

    Will Matsuoka is the creator of W=M/Stata - he likes creativity and simplicity, taking pictures of food, competition, and anything that can be analyzed.

    For more information about this site, check out the teaser above!

    Archives

    July 2016
    June 2016
    March 2016
    February 2016
    January 2016
    December 2015
    November 2015
    October 2015
    September 2015

    Categories

    All
    3ds Max
    Adobe
    API
    Base16
    Base2
    Base64
    Binary
    Bitmap
    Color
    Crawldir
    Email
    Encryption
    Excel
    Exif
    File
    Fileread
    Filewrite
    Fitbit
    Formulas
    Gcmap
    GIMP
    GIS
    Google
    History
    JavaScript
    Location
    Maps
    Mata
    Music
    NFL
    Numtobase26
    Parsing
    Pictures
    Plugins
    Privacy
    Putexcel
    Summary
    Taylor Swift
    Twitter
    Vbscript
    Work
    Xlsx
    XML

    RSS Feed

Proudly powered by Weebly
  • Introduction
  • Topics
  • Stata
  • Teaching
    • ECON 641L
    • ECON 640L >
      • Econ-Data
  • Blog
  • About