


 Function that reports summary subject statistics of a given experiment
 such as gender count and ages.
 INPUT
   a cell array of 2 structs: session_info and subject_info, as returned 
   by ensemble_load_expinfo. This function will be called from within
   ensemble_load_expinfo, so there shouldn't normally be a need to call
   this function explicitly. However, in the event that we have data 
   from an old analysis that didn't include the output from this function, 
   it can be called easily by passing this information.
  
 OUTPUT
   An ensemble data struct with the following variables:
      nsubs -                    The number of subjects that were included in the analysis
      mean_age -                 The mean age of included subjects
      std_age  -                 The standard deviation of ages included in analysis
      age_range -                The minimum and maximum ages of included subs
      num_female -               The number of subjects that reported female gender
      num_male   -               The number of subjects that reported male gender
      num_gender_no_report -     The number of subjects that didn't report gender
      sessions_per_sub -         The number of sessions per subject. If the number of
                                 sessions for all subjects is not equal, the minimum 
                                 and maximum will be reported.
      mean_session_dur_minutes - The mean session duration in minutes. If
                                 there are multiple sessions per subject, 
                                 the mean for each session order
                                 will be reported (e.g. if 4 sessions/subject, 
                                 then 4 values are reported).
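
The age, session duration, and sessions-per-subject rules described above can be sketched in isolation as follows. This is a minimal illustration, not the function itself: the dates and session counts are made up, the variable names `dob`, `sessStart`, `sessEnd`, `durMinutes`, and `sessionsPerSub` are hypothetical, and NaNs are skipped by logical indexing rather than with `nanmean`, so the snippet does not need the Statistics Toolbox.

```matlab
% Hypothetical data for two subjects (all values are invented for illustration).
dob       = {'1990-06-15', '2010-05-08'};   % second DOB equals the session date
fmt       = 'yyyy-mm-dd HH:MM:SS';
sessStart = [datenum('2010-05-08 10:00:00', fmt), datenum('2010-05-08 14:00:00', fmt)];
sessEnd   = [datenum('2010-05-08 10:45:00', fmt), datenum('2010-05-08 14:30:00', fmt)];
nSess     = [3 2];                          % assumed session counts per subject

nsubs      = numel(dob);
subAges    = nan(1, nsubs);
durMinutes = nan(1, nsubs);
for isub = 1:nsubs
  % Age in whole years at the session start, using a 365-day year
  dobDatenum = datenum(dob{isub}, 'yyyy-mm-dd');
  subAges(isub) = floor((sessStart(isub) - dobDatenum) / 365);

  % An age of zero is treated as a data-entry error and excluded
  if subAges(isub) == 0
    subAges(isub) = NaN;
  end

  % Serial date numbers are in days, so convert the difference to minutes
  durMinutes(isub) = (sessEnd(isub) - sessStart(isub)) * 24 * 60;
end

% Summary statistics over the subjects with a usable age
validAges = subAges(~isnan(subAges));
meanAge   = mean(validAges);
ageRange  = [min(validAges) max(validAges)];

% A single value if every subject has the same count, otherwise [min max]
if all(diff(nSess) == 0)
  sessionsPerSub = nSess(1);
else
  sessionsPerSub = [min(nSess) max(nSess)];
end
```

With this data the second subject's age becomes NaN and drops out of the age statistics, and the unequal session counts are reported as a minimum and maximum pair.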
 Nov 3, 2009 - Stefan Tomic, First Version
 May 8, 2010 - S.T. fixed handling of anon_<hash> subIDs, which don't exist in
               subject table. Loops by subject ID retrieved in sessInfo
 May 22, 2010  PJ - added searching on type=session_info and
               type=subject_info when searching for these variables in the
               name field fails. Fixed handling of age=0.

function outData = ensemble_summary_subject_stats(inData)
%
% Function that reports summary subject statistics of a given experiment
% such as gender count and ages.
%
% INPUT
%   a cell array of 2 structs: session_info and subject_info, as returned
%   by ensemble_load_expinfo. This function will be called from within
%   ensemble_load_expinfo, so there shouldn't normally be a need to call
%   this function explicitly. However, in the event that we have data
%   from an old analysis that didn't include the output from this function,
%   it can be called easily by passing this information.
%
% OUTPUT
%   An ensemble data struct with the following variables:
%      nsubs -                    The number of subjects that were included in the analysis
%      mean_age -                 The mean age of included subjects
%      std_age  -                 The standard deviation of ages included in analysis
%      age_range -                The minimum and maximum ages of included subs
%      num_female -               The number of subjects that reported female gender
%      num_male   -               The number of subjects that reported male gender
%      num_gender_no_report -     The number of subjects that didn't report gender
%      sessions_per_sub -         The number of sessions per subject. If the number of
%                                 sessions for all subjects is not equal, the minimum
%                                 and maximum will be reported.
%      mean_session_dur_minutes - The mean session duration in minutes. If
%                                 there are multiple sessions per subject,
%                                 the mean for each session order
%                                 will be reported (e.g. if 4 sessions/subject,
%                                 then 4 values are reported).
%
% Nov 3, 2009 - Stefan Tomic, First Version
% May 8, 2010 - S.T. fixed handling of anon_<hash> subIDs, which don't exist in
%               subject table. Loops by subject ID retrieved in sessInfo
% May 22, 2010  PJ - added searching on type=session_info and
%               type=subject_info when searching for these variables in the
%               name field fails. Fixed handling of age=0.


fParams.name = 'session_info';
an_idx = ensemble_find_analysis_struct(inData,fParams);

% if searching on name field failed, try searching on type field
if isempty(an_idx)
  fParams = struct('type','session_info');
  an_idx = ensemble_find_analysis_struct(inData,fParams);
end
if isempty(an_idx)
  fprintf('Failed to find session_info\n')
  return
end

sessInfo = inData{an_idx};
sessInfoCols = set_var_col_const(sessInfo.vars);

fParams.name = 'subject_info';
an_idx = ensemble_find_analysis_struct(inData,fParams);
% if searching on name field failed, try searching on type field
if isempty(an_idx)
  fParams = struct('type','subject_info');
  an_idx = ensemble_find_analysis_struct(inData,fParams);
end
if isempty(an_idx)
  fprintf('Failed to find subject_info\n')
  return
end
subInfo = inData{an_idx};
subInfoCols = set_var_col_const(subInfo.vars);

subids = unique(sessInfo.data{sessInfoCols.subject_id});
nsubs = length(subids);
nFemales = 0;
nMales = 0;
nGenderNoReport = 0;

for isub = 1:nsubs

  thisSubID = subids{isub};

  [subInSubTable,subInfoIdx] = ismember(thisSubID,subInfo.data{subInfoCols.subject_id});

  if(subInSubTable)
    subDOB = subInfo.data{subInfoCols.dob}{subInfoIdx};

    if(~isnan(subDOB))
      dobDatenum = datenum(subDOB,'yyyy-mm-dd');
    else
      dobDatenum = NaN;
    end

    subGender = subInfo.data{subInfoCols.gender}{subInfoIdx};

    switch(subGender)
     case 'F'
      nFemales = nFemales +1;
     case 'M'
      nMales = nMales+1;
     otherwise
      nGenderNoReport = nGenderNoReport + 1;
    end

  else
    subDOB = NaN;
    dobDatenum = NaN;
    nGenderNoReport = nGenderNoReport + 1;
  end

  sessionIdxs = strmatch(thisSubID,sessInfo.data{sessInfoCols.subject_id},'exact');
  nSess(isub) = length(sessionIdxs);

  sessDatenums = sessInfo.data{sessInfoCols.date_time}(sessionIdxs);
  sessEndDatenums = sessInfo.data{sessInfoCols.end_datetime}(sessionIdxs);


  %use the earliest session that this subject participated in to determine age
  useSessDatenum = min(sessDatenums);

  %sort sessions for this sub by start_time to find session order
  [sortedSessDatenums,sortedIdxs] = sort(sessDatenums);
  sortedSessEndDatenums = sessEndDatenums(sortedIdxs);

  nSessThisSub = length(sortedIdxs);
  for iSess = 1:nSessThisSub

    serialSessDuration = sortedSessEndDatenums(iSess) - sortedSessDatenums(iSess);
    sessDurations(isub,iSess) = serialSessDuration * 24 * 60;

  end

  serialAge = useSessDatenum - dobDatenum;
  subAges(isub) = floor(serialAge/365);

  % If the age is somehow set to zero, e.g. if subject accidentally entered
  % current day as birthday, enter NaN
  if subAges(isub) == 0
    subAges(isub) = NaN;
  end

end

%replace zeros in sessDurations with NaNs. These are sessions that were not
%recorded for a subject (e.g. all subjects completed three sessions, except for
%one subject that only completed two sessions).
sessDurations(sessDurations == 0) = NaN;

%find mean duration per session
nMaxSess = size(sessDurations,2);
for iSess = 1:nMaxSess
  sessMeanDur(iSess) = nanmean(sessDurations(:,iSess));
end

meanAge = nanmean(subAges);
stdAge = nanstd(subAges);
ageRange = [nanmin(subAges) nanmax(subAges)];

if(all(diff(nSess) == 0))
  nSessPerSub = nSess(1);
else
  nSessPerSub = [min(nSess) max(nSess)];
end


outData = ensemble_init_data_struct;
outData.name = 'summary_subject_stats';
outData.type = 'summary_stats';
outData.vars = {'nsubs','mean_age','std_age','age_range','num_female','num_male','num_gender_no_report' ...
                'sessions_per_sub' 'mean_session_dur_minutes'};
outDataCols = set_var_col_const(outData.vars);
outData.data{outDataCols.nsubs} = nsubs;
outData.data{outDataCols.mean_age} = meanAge;
outData.data{outDataCols.std_age} = stdAge;
outData.data{outDataCols.age_range} = ageRange;
outData.data{outDataCols.num_female} = nFemales;
outData.data{outDataCols.num_male} = nMales;
outData.data{outDataCols.num_gender_no_report} = nGenderNoReport;
outData.data{outDataCols.sessions_per_sub} = nSessPerSub;
outData.data{outDataCols.mean_session_dur_minutes} = sessMeanDur;