(Last updated: 7/23/2007)
We are using Latent Gold (LG) version 4.5, with Latent Gold Choice as an add-in. The official website only mentions LG v4.0, but newer versions do become available from time to time. (By the way, demo versions and documentation are posted at http://www.statisticalinnovations.com/products/)
The latest version partly supports batch mode. To use it, run the following command in a command-line window:
lg45.exe FILENAME.lgf
The LG installer automatically creates a Start menu shortcut to a command-line window in which the path to the LG directory has been added to the environment (the %PATH% variable). The installer has a bug, however: it does not detect where "cmd.exe" is located and simply assumes it is under C:\Windows\, which works on most WinXP machines. Unfortunately, the default Windows directory on Win2000 is C:\WINNT\, so the shortcut will not work there until we fix it manually.
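To run several model files in one go, the command above can be wrapped in a small shell loop. This is only a sketch: it assumes a bash-like shell is available and that lg45.exe is on the PATH. The function name run_lg_batch and the LG_CMD override variable are made up for this example and are not part of Latent Gold.

```shell
# Run Latent Gold in batch mode on each model file given as an argument.
# LG_CMD is a hypothetical override for this sketch; it defaults to lg45.exe.
run_lg_batch() {
    local lg_cmd="${LG_CMD:-lg45.exe}"
    local model_file
    for model_file in "$@"; do
        echo "Running: $model_file"
        # If the command returns a nonzero exit status, report it and
        # carry on with the remaining files.
        "$lg_cmd" "$model_file" || echo "Failed: $model_file" >&2
    done
}
```

A call such as run_lg_batch *.lgf would then process every model file in the current directory, one after another.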
LG model files use the *.lgf extension. An lgf file is actually an ASCII text file, and it can be saved from the GUI version of LG (lg45win.exe). The structure of the model file is not documented ... yet. (For many software companies, documentation always lags behind development, doesn't it?) So I manually saved several versions of a model file with slightly different options and compared them to work out the grammar of the lgf file. Most settings are quite straightforward. I'm not exactly sure about "chnum", but it seems we can give it the same list of independent variables as "chvar". "outsect" is also a little complicated, but finding out all its possible values and what they mean should just be a matter of time if we try every combination. Here's an example of something useful to us:
data='./0716b_rating_5.txt';
model:'0716b_rating_5' rating 5 /
toler=1e-008 tolem=0.01 tolran=1e-005 bayes=1 bayess2=1 bayeslat=1
bayespoi=1
iterem=250 iternr=50 itersv=50
iterboot=500
nseed=0 nseedboot=0
nrand=10
usemiss=No
sewald=yes dummy=no
outsect=0x1c37
outclstd outpred out='./0716b_rating_5.out';
chdes = 1;
dependent DEP_VAR;
replicate CASE_ID;
chvar INDEP_VAR1 INDEP_VAR2 iNDEP_VARn;
chnum INDEP_VAR1 INDEP_VAR2 iNDEP_VARn;
covariate COV_VAR1 COV_VAR2 COV_VARm;
attr DEP_VAR ordinal ;
attr COV_VAR1 ordinal ;
attr COV_VAR2 ordinal ;
attr COV_VARm ordinal ;
In the above example, we have a 5-class rating model (line 2) that uses the data file specified in line 1. Besides letters and digits, the model name can contain '-', '_', and perhaps some other characters, provided that the name is quoted (in either single or double quotes). Lines 3 to 12 specify options corresponding to those in the "Technical", "Output", and "ClassPred" tabs of a choice model dialog. I am not sure about line 13 ("chdes"), but leaving it as 1 seems to be OK. Line 14 is the dependent variable; line 15 is the case ID (the same as in the "Variables" tab; it identifies observations belonging to the same individual); and lines 16 and 17 ("chvar" and "chnum") list the independent variables, separated by spaces. The GUI version seems to break these two lines into multiple lines so that no line exceeds 80 characters, but my tests have shown that this is not required. "covariate" defines the set of covariates. The last few lines start with "attr"; they tell LG whether the dependent variable and each covariate are nominal or ordinal.
By default, LG batch mode writes a text file named "*.lst", where "*" is the same base name as that of the model file "*.lgf". This file contains most of the useful output, such as the estimated coefficients, some fit measures of the model (such as LL, AIC, BIC, BIC3), and the posterior probabilities of each class for each individual (if we choose to output them). A second output file is created if the model file contains an "out='xxxxx';" statement. It may include prior or posterior classification probabilities and individual coefficients, if such output is requested in the model file.
I use the following (one-line) sed statement to extract some useful information from the "*.lst" file. The output is redirected to a file, which I can then open in Excel, format, and send to my boss for discussion.
#!/bin/bash
# Print selected sections of the Latent Gold *.lst file given as argument; written for GNU sed (\t, \+).
sed -e '1,5p' -e '6,/^[^\t]\+\t\t/{s/^[^\t]\+\t\t//;p}' -e '/ = /,/AIC (based on LL)/{/ = /{x;p;p;x};/Log-prior/d;/Log-posterior/d;/^Chi-squared Statistics/,/^\t\{6,\}/d;p}' -e '/^Profile/{n;n;p}' -e '/^Parameters/,/^Importance/!d;/^Parameters/{x;p;x};/^Importance/d' "$@"
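With several "*.lst" files to process, the script above can be applied to each of them in turn. The sketch below assumes the script has been saved as an executable file; the function name summarize_lst_files and the ".summary.txt" suffix are made up for this example.

```shell
# Run an extraction script on each .lst file and save the result next to it
# as NAME.summary.txt. The first argument is the path to the extraction
# script; the remaining arguments are the .lst files.
summarize_lst_files() {
    local extract="$1"
    shift
    local lst_file
    for lst_file in "$@"; do
        "$extract" "$lst_file" > "${lst_file%.lst}.summary.txt"
    done
}
```

Each file is passed to the script separately because the sed statement addresses lines by number (1,5p), which only makes sense for one input file at a time.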