If you are working with data from the Human Connectome Project (HCP), these three small Octave/MATLAB utilities may be of use:
- hcp2blocks.m: Takes the restricted file with information about kinship and zygosity and produces a multi-level exchangeability blocks file that can be used with PALM for permutation inference. It is fully described here.
- hcp2solar.m: Takes restricted and unrestricted files to produce a pedigree file that can be used with SOLAR for heritability and genome-wide association analyses.
- picktraits.m: Takes either restricted or unrestricted files, a list of traits and a list of subject IDs to produce tables with selected traits for the selected subjects. These can be used to, e.g., produce design matrices for subsequent analysis.
These functions need to parse relatively large CSV files, which is somewhat inefficient in MATLAB and Octave. Still, since these commands usually have to be executed only once for a particular analysis, a 1-2 minute wait seems acceptable.
If downloaded directly from the above links, remember also to download the prerequisites: strcsvread.m and strcsvwrite.m. Alternatively, clone the full repository from GitHub. The link is this. Other tools may be added in the future.
A fourth utility
For the HCP-S1200 release (March/2017), zygosity information is provided in the fields ZygositySR (self-reported zygosity) and ZygosityGT (zygosity determined by genetic methods for select subjects). If needed, these two fields can be merged into a new field named simply Zygosity. To do so, use a fourth utility, the command mergezyg.
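The idea of merging the two fields can be illustrated with a minimal sketch. It is not the code of mergezyg itself: the merging rule (use the genotyped value when present, otherwise fall back to the self-reported one) is an assumption, and the four example values below are hypothetical rather than read from a RESTRICTED.csv file.

```matlab
% Hypothetical example values; in practice these columns come from RESTRICTED.csv.
ZygositySR = {'NotTwin'; 'MZ'; 'NotMZ'; 'NotTwin'};  % self-reported zygosity
ZygosityGT = {' '; 'MZ'; 'DZ'; ''};                  % genotyped, blank if not tested

% Assumed rule: prefer the genotyped value, fall back to self-report.
GT       = strtrim(ZygosityGT);        % treat whitespace-only entries as blank
hasGT    = ~cellfun(@isempty, GT);     % true where a genotyped value exists
Zygosity = ZygositySR;                 % start from the self-reported column
Zygosity(hasGT) = GT(hasGT);           % overwrite where genotyping is available

disp(Zygosity);
```

With this rule, a subject who self-reported as NotMZ but was genotyped as DZ ends up labelled DZ, while subjects without genotype data keep their self-reported label.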
Dear Dr. Winkler, I have attempted to run mergezyg with the RESTRICTED.csv file as your instructions suggest. I ended up with the column header “Zygosity” appended to the last record (instead of to the first, i.e., the header row), and I see no new merged column being created (it looks as if the entire column was shifted down and truncated at the header).
Would you please help to figure out what to do to fix this?
Your busy time is very much appreciated.
Thanks for the message. It seems to be working for me, with a RESTRICTED.csv file downloaded on 16/April/2017. A column “Zygosity” is created at the end, containing a merger of ZygositySR and ZygosityGT. I note that there were changes in the publicly released HCP files between March and April, which changed some fields, and the script may not work with those earlier files. It should work with the final release, though.
At any rate, I have edited hcp2blocks.m and hcp2solar.m so that the functionality provided by mergezyg is integrated into them; therefore, at least for these two functions, mergezyg is no longer needed.
I’ve just edited this page to reflect these changes.
Hope this helps!
All the best,
When I posted my previous message, I did use a download that was created in March. So, now I downloaded the latest RESTRICTED.csv, the latest hcp2blocks.m and ran hcp2blocks.m inside MATLAB (without prior use of mergezyg) as follows:
I get an error message saying:
Index exceeds matrix dimensions.
Error in hcp2blocks (line 148)
U = unique(tab(:,2:3),'rows');
Would you know how to overcome this roadblock?
It seems to be working for me, with the hcp2blocks.m that is available here (that is, the version on GitHub) and the most recent RESTRICTED file:
>> hcp2blocks('RESTRICTED_winkler_5_13_2017_22_3_55.csv', 'EB.csv');
Warning: These subjects have data missing in the restricted file and will be removed:
I wonder if you have all the dependencies, including strcsvread.m? Can’t think of any other problem…
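One quick way to check for the dependencies is to ask MATLAB or Octave whether each file is visible on the path. The snippet below is only a sketch: it assumes the three file names mentioned on this page and uses the built-in exist function, which returns 2 when a name resolves to a file.

```matlab
% Names taken from this page; adjust the list if other files are needed.
deps  = {'hcp2blocks', 'strcsvread', 'strcsvwrite'};
found = false(size(deps));

for i = 1:numel(deps)
    % exist(name, 'file') returns 2 when name resolves to a file on the path
    found(i) = (exist(deps{i}, 'file') == 2);
    if found(i)
        fprintf('%-12s found\n', deps{i});
    else
        fprintf('%-12s NOT on the path\n', deps{i});
    end
end
```

If any of them is reported as missing, adding the folder that contains the downloaded files with addpath should make it visible.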
All the best,