Dimensions (PASW Data Collection) 5.6 Table Scripting and Data Management
Cobalt Sky were asked to get involved in beta testing for the new version of Dimensions. During this time we were also asked if we would like to present at the SPSS Directions 2009 conference in Prague. Not being one to forsake a free meal (or two), I accepted.
I started by meeting up with Nick Read, SPSS’s Dimensions Product Manager. We began by addressing the main issues with the software, which turned out to be performance and the functionality of grids.
Performance
We ran a few timing tests between versions 4.5 and 5.6. Unfortunately, there was no improvement in speed. In timing tests between TOM (the Table Object Model behind Dimensions table scripting) and Quantum on a small job, with say 100 variables and 600 respondents, both TOM and Quantum ran in less than 10 seconds. When the number of variables was increased to 600 (still with 600 respondents), Quantum still ran in less than 10 seconds, whereas TOM took 5 minutes! Help! I am told that SPSS are looking into this.
Before we go any further I want to make it clear that on any small to medium size job I would use TOM as my preference. The problems may arise for large companies who decide that they need to remove Quantum completely from their system; this would imply that their largest continuous study would have to be migrated to TOM for this strategy to be fulfilled. This is still a grey area. SPSS need to work on performance and then this may become clearer.
Grid/Variable Scripting Improvements in mrScriptBasic Table Scripting
Back to grids: this is something that SPSS have addressed. The new functionality and improved accessibility of grids are a success. We can now create grids at the table scripting stage in an mrScriptBasic (.mrs) script. Previously we had to write code in a Data Management Script (.dms) and also add the metadata by hand. The improved code means that we can now use:
TableDoc.DataSet.Variables.AddNewGrid("Hotel,Food,City","MyGrid","Sat")
This takes three individual variables (Hotel, Food and City) that share the same category list (Sat) and makes a grid out of them. In effect, this single line does the recode and adds the metadata.
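To picture what that one line achieves, here is a rough sketch in plain Python rather than mrScriptBasic. It does not use the Dimensions object model at all: the dictionaries, the make_grid helper and the "Dissatisfied" category are all assumptions made up for illustration.

```python
# Illustrative only: plain Python data, not the Dimensions object model.
# make_grid is a hypothetical helper, and "Dissatisfied" is an assumed category.

def make_grid(record, iteration_names, field_name):
    """Gather flat variables that share a category list into one grid."""
    return {name: {field_name: record[name]} for name in iteration_names}

# One respondent with three separate variables on the same satisfaction scale.
respondent = {
    "Hotel": "Satisfied",
    "Food": "Very_Satisfied",
    "City": "Dissatisfied",
}

# The grid has one iteration per original variable, each holding a "Sat" field.
my_grid = make_grid(respondent, ["Hotel", "Food", "City"], "Sat")

# Reading one slice of the grid, loosely analogous to MyGrid[{Hotel}].Sat.
hotel_slice = my_grid["Hotel"]["Sat"]
```

The point of the sketch is only the shape of the data: three flat answers become one structure indexed by iteration.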
We can also add regular (non grid) variables in .mrs scripts using:
TableDoc.DataSet.Variables.AddNewVariable("!
NHotel categorical [1..1] expression("MyGrid[{Hotel}].Sat");
!")
The above example references a grid slice. This also shows that grid slices can be referenced more easily in 5.6. This affects regular grid slices and also slices of hierarchical loops. Therefore we can now write:
Person[1].Gender = {Male}
rather than:
SUM(Person.(LevelID=1 AND Gender={Male}))
In 5.6, grid iterations can now be manipulated as well. Fields within grids could already be manipulated before 5.6.
Overall, the flexibility in grid/variable scripting has been improved markedly in this version:
- Derive grids more easily at table scripting stage
- New variables derived more easily
- Grid iterations can be manipulated
- Grid slices referenced directly
- As in earlier versions, can save the .mdd
- Can use .mdd to export derived case data
Nice one SPSS!
Grid Scripting Improvements in Data Management Scripting
Some new functions have been added to aid manipulation of grids. Much of this is superseded by the new .mrs functionality, but you may find the .dms functions useful if you are only exporting data and not running tables.
- CopyGrid function
  - Copies all or part of a grid into a new grid
  - Used in the OnNextCase section
  - Metadata needs to be created
  - Uses: manage a large grid by copying part of it into a smaller grid...
This is the code that would be used in the OnNextCase section:
CopyGrid(MyGrid, NewGrid, {Food,City})
This adds a new grid called ‘NewGrid’ and takes only the Food and City iterations from a pre-existing grid ‘MyGrid’. However, you do have to add the metadata for the new grid manually, so it’s not completely hassle free!
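As a rough picture of the data movement involved, the sketch below copies two iterations of a grid into a new one using plain Python dictionaries. It is not the Dimensions CopyGrid function: copy_grid is a hypothetical helper, and the sample answers are invented.

```python
# Illustrative only: a plain-Python picture of copying part of a grid.
# copy_grid is a hypothetical helper, not the Dimensions CopyGrid function.

def copy_grid(source_grid, iterations):
    """Return a new grid holding copies of the requested iterations only."""
    return {name: dict(source_grid[name]) for name in iterations}

# Invented sample data for one respondent.
my_grid = {
    "Hotel": {"Sat": "Satisfied"},
    "Food": {"Sat": "Very_Satisfied"},
    "City": {"Sat": "Satisfied"},
}

# Keep only the Food and City iterations; the source grid is left untouched.
new_grid = copy_grid(my_grid, ["Food", "City"])
```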
- CreateGridSummary function
  - Summarises categories
  - Used in the OnNextCase section
  - Metadata needs to be created
  - Uses: ideal for a Top 2 summary table
Usage:
CreateGridSummary(Grid, SumVar, Categories)
Example:
CreateGridSummary(MyGrid, Top2, {Very_Satisfied,Satisfied})
Note:
Improved table scripting now makes this easier via .mrs.
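For anyone unsure what a Top 2 summary looks like as data, here is one plausible reading sketched in plain Python: for each respondent, record which grid iterations received one of the top two answers. This is an assumption about the shape of the result, not a description of how CreateGridSummary is implemented; grid_summary and the sample answers are hypothetical.

```python
# Illustrative only: one plausible reading of a "Top 2" grid summary.
# grid_summary is a hypothetical helper, not the Dimensions function.

def grid_summary(grid, field_name, categories):
    """Return the iterations whose answer falls in the given categories."""
    return {name for name, fields in grid.items() if fields[field_name] in categories}

# Invented sample data for one respondent; "Dissatisfied" is an assumed category.
my_grid = {
    "Hotel": {"Sat": "Satisfied"},
    "Food": {"Sat": "Very_Satisfied"},
    "City": {"Sat": "Dissatisfied"},
}

# Iterations rated in the top two boxes by this respondent.
top2 = grid_summary(my_grid, "Sat", {"Very_Satisfied", "Satisfied"})
```

Tabulating a variable like top2 across all respondents gives one row per iteration, which is the usual layout of a Top 2 summary table.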
- FlattenGrid function
  - Creates individual variables at top level from grid slices
  - Used in the OnBeforeJobStart section
  - Uses: flattened variables can be used as filters
Usage:
FlattenGrid(Grid,Iteration,MDMObject)
Example:
FlattenGrid("MyGrid.Sat",Null,MDoc)
Note:
Improved table scripting now makes this easier via .mrs.
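The idea of flattening can be sketched in plain Python as turning each grid slice into its own top-level variable, which is then easy to use as a filter. The variable naming scheme (Hotel_Sat and so on), the flatten_grid helper and the sample data are assumptions for illustration and say nothing about the names FlattenGrid actually generates.

```python
# Illustrative only: flattening grid slices into top-level variables.
# flatten_grid and the "<iteration>_<field>" naming are hypothetical.

def flatten_grid(grid, field_name):
    """Create one top-level variable per grid iteration."""
    return {f"{name}_{field_name}": fields[field_name] for name, fields in grid.items()}

# Invented sample data for one respondent.
my_grid = {
    "Hotel": {"Sat": "Satisfied"},
    "Food": {"Sat": "Very_Satisfied"},
    "City": {"Sat": "Satisfied"},
}

flat = flatten_grid(my_grid, "Sat")

# A flattened variable used as a simple filter condition.
hotel_satisfied = flat["Hotel_Sat"] == "Satisfied"
```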
- FlipGrid function
- Designed to flip grid as iterations could not be manipulated prior to 5.6
Usage:
FlipGrid(SourceGrid,DestinationGrid)
Note:
Grid iterations can now be manipulated via .mrs.
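Flipping a grid can be pictured as swapping its two axes, so that what used to be the answer categories become the iterations and vice versa. The plain-Python sketch below shows that transposition under this assumption; flip_grid is a hypothetical helper and the sample data is invented, so treat it as a picture of the idea rather than of FlipGrid itself.

```python
# Illustrative only: swapping the iteration and category axes of a grid.
# flip_grid is a hypothetical helper, not the Dimensions FlipGrid function.

def flip_grid(grid, field_name):
    """Index the grid by answer category, listing the iterations that got it."""
    flipped = {}
    for name, fields in grid.items():
        flipped.setdefault(fields[field_name], set()).add(name)
    return flipped

# Invented sample data for one respondent.
my_grid = {
    "Hotel": {"Sat": "Satisfied"},
    "Food": {"Sat": "Very_Satisfied"},
    "City": {"Sat": "Satisfied"},
}

flipped = flip_grid(my_grid, "Sat")
```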
Summary of .dms Grid Functions:
- Create persisted variables, i.e. real case data
- Save time, as the code is shorter than before and pre-compiled, and therefore more efficient
- Economise on .dms code
- BUT the metadata must be set up in the .dms
- A lot of this can be done via .mrs in 5.6
- May be more useful if you are exporting data rather than running tabs
Other functions added
- Decode
  - A ‘fetch’ solution
  - Similar to many ‘if’ statements/Select Case
- Significance letters
  - Ability to set / test specific letters/columns
- WHERE clause
  - Can be used on HDATA for DDF CDSC
- FilterBy
  - Same as Intersection function or ‘*’
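To show why a ‘fetch’ style function is attractive compared with a long run of ‘if’ statements, here is the general idea in plain Python. It does not reproduce the signature or behaviour of the Dimensions Decode function, which is not given above; the region codes, labels and helper names are all invented.

```python
# Illustrative only: a lookup ("fetch") versus a chain of if statements.
# The codes, labels and function names are invented for this sketch.

def label_with_ifs(code):
    """The long-hand version: one branch per value."""
    if code == 1:
        return "North"
    elif code == 2:
        return "South"
    elif code == 3:
        return "East"
    else:
        return "Other"

REGION_LABELS = {1: "North", 2: "South", 3: "East"}

def label_with_lookup(code):
    """The fetch version: one lookup with a default for anything unmatched."""
    return REGION_LABELS.get(code, "Other")

# Both versions give the same answers for these codes.
results = [(label_with_ifs(c), label_with_lookup(c)) for c in (1, 2, 3, 9)]
```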
Conclusion
So, should people use 5.6? The answer is a definite ‘yes!’. Any existing TOM users should upgrade ASAP. Anyone not yet using Dimensions (Data Collection) should take a look at it if possible. I am convinced that TOM will take over from Quantum one day; when that happens depends on how soon SPSS address the issue of speed.
Richard Coffey is a Senior DP Consultant at Cobalt Sky