Internship; Stata; Latent Gold
Early this month I became a "science intern" at a company known for its leading role in personalization solutions (recommending products, TV programs, movie DVDs, music, etc.). So far most of my work involves developing models for recommending movies, estimating these models, and finding the best one. Another task is to write small programs that automate most of the above steps so that we can reuse them for other modeling work.
Most of the analyses for model development are done in Stata. I had not used it before, but it is not difficult to learn. My colleagues gave me some online tutorials, which are quite useful for beginners like me. Stata also has an online help system similar to the one in Matlab. With these documents and some examples I am able to read others' code and write my own. By the way, the only tricky thing for a beginner is that in Stata "variable" and "macro" do not mean what they mean in languages I am familiar with, such as C/C++. In Stata, "variables" are the fields or columns of a table (as in a relational database), while "macros" play the role that variables play in C/C++: named handles for a piece of text or a number, which Stata substitutes into a command wherever the macro is referenced.
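To make the distinction concrete, here is a small sketch of my own. It assumes the auto example dataset that ships with Stata; the macro names controls and n are just names I made up for illustration.

```stata
* Load the example dataset that ships with Stata (my choice for illustration)
sysuse auto, clear

* "Variables" are columns of the dataset: price and mpg are two of them
summarize price mpg

* A local macro stores text; here it holds a list of variable names
local controls "mpg weight"

* Stata substitutes the macro's contents before running the command,
* so this line runs as: regress price mpg weight
regress price `controls'

* A macro can also store the result of an expression, e.g. the number of rows
local n = _N
display "Number of observations: `n'"
```

Note the quoting: a local macro is referenced with a left quote and an apostrophe around its name, which is what tells Stata to paste in its contents.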
To see whether consumers can be classified into different market segments, we also try to estimate cohort-specific coefficients. For this purpose we use Latent Gold, a software package for latent class and finite mixture modeling developed by Statistical Innovations Inc.