Did you know that Mata has a hidden command called numtobase26? Given its usefulness with putexcel, I can’t possibly see why it’s hidden! Perhaps Stata doesn’t want us using their highly-tuned statistical analysis software as a glorified spreadsheet beautifier. Nevertheless, you may be forced to automate a god-awful excel report in order to free up time to utilize Stata to the fullest!
This is where we start to depart from our beloved putexcel, and shift slightly towards xl(). Both commands tend to serve the same purpose, but much like Stata vs. Mata, putexcel is meant for the layman, and Mata is meant to make things fast while being slightly less readable. Gross oversimplification aside, putexcel favors the traditional spreadsheet alphabetical columns names combined with row numbers A1, A2, A3 whereas xl() generally takes the r1, c1 form. Say you have a dataset with 50 variables: v1, v2, v3, …, v50 and you wish to rename the variables so that they match the rows of excel. We could try and loop through the alphabet c(ALPHA), but once you get to Z, things start to get tricky. For me, a much simpler solution is to use numtobase26. For those unfamiliar with Mata, you can make a vector of sequential numbers with ease: mata: (1..50)
We have a row vector with 50 elements that we can now convert to our excel column names using:
mata: numtobase26((1..50)) A B C D E F G H I J K L M N O P Q R S T U V W X Y Z AA AB AC AD AE AF AG AH AI AJ AK AL AM AN AO AP AQ AR AS AT AU AV AW AX
We just need to store it as a local so we can use it easily in Stata. (Also concatenate the elements)
mata: st_local("alphavars", invtokens(numtobase26(1..50))) ren * (`alphavars'), dry You can imagine how this can greatly simplify creating putexcel functions that require you to reference columns by their excel name using a simple foreach var of varlist * combined with the levelsof technique described in Part I. Codebook
What happens if you want to go the other way? Say you have the column BC12 but you wish to convert this to 55, 12 so you can use it in one if xl()’s many functions. Unfortunately, I could not find an easy solution, so I made a rough fix that should be similar to how putexcel handles its cell expressions. Feel free to use the numofbase26() Mata function if you ever find yourself in this predicament.
real matrix numofbase26(string matrix base) { real matrix output, pwr, b real scalar i, j, k, l base = strupper(base) output = J(rows(base), cols(base), .) for (i=1; i<=rows(base); i++) { for (j=1; j<=cols(base); j++) { if (strlen(base[i,j]) == 1) output[i,j] = ascii(base[i,j]) - 64 else { l = strlen(base[i,j]) b = pwr = J(1, l, .) for (k=1; k<=l; k++) { b[1,k] = ascii(substr(base[i,j], k, 1)) - 64 pwr[1, k] = l - k } output[i,j] = rowsum(b :* (26:^pwr)) } } } return(output) } Happy Automating! I promise the next post will contain less technical stuff and more pictures.
3 Comments
tatan
6/17/2017 06:30:14 am
thanks
Reply
Sergiy
7/17/2024 01:03:03 pm
See https://ideas.repec.org/c/boc/bocode/s457865.html
Reply
Leave a Reply. |
AuthorWill Matsuoka is the creator of W=M/Stata - he likes creativity and simplicity, taking pictures of food, competition, and anything that can be analyzed. Archives
July 2016
Categories
All
|