Open Access Dataset and Toolbox of High-Density Surface Electromyogram Recordings 1.0.0

File: (9,398 bytes)
function [x_plus, A, b] = e_sifir_trn(Eamp, T, Param, Trim)
%E_SIFIR_TRN Identify (i.e., train) a non-linear FIR EMG-torque model.
%
% x_plus = e_sifir_trn(EMGamp, T, [Q D Tol ii], Trim)
%
% Uses regularized (via the pseudo-inverse technique) linear least squares
% [Press et al., 1994] to compute one or more sets of fit parameters of the
% non-linear FIR EMG-torque model specified by Koirala et al. [in review],
% including use of multiple training sets as well as extension to multiple
% inputs and outputs.  Can also be used for fitting of other linear dynamic
% models.  The model is of the form:
%  T(m+ii,co) = SUM_c_in(  SUM_d( SUM_q (
%           fit(q,d,ci,co) .* (Eamp(m-q,ci) .^ d)) )  ) + error,
% where:
%   m: (scalar integer) sample (time) index.  Not actually used within this
%      function, but useful in understanding the model.
%   ii: (scalar integer) number of samples into the future at which
%      the torque should be estimated.  Set ii=0 to estimate torque
%      at the current time.  Parameter ii cannot be negative.  Note that as
%      ii increases, the number of points available to the least squares
%      fit decreases.  See Koirala at al. [in review] for more details, but
%      EMG can be used to estimate torque into the future, with excellent
%      performance out to 60 ms and very good performance out to at least
%      100 ms.
%   ci: (scalar integer) input channel number.
%   Nci: (scalar integer) number of input channels.
%   co: (scalar integer) output channel number.
%   Nco: (scalar integer) number of output channels.
%   T(m) or T(m,co): If a vector, gives the output torque values
%      at times m for the one output.  A column vector is the
%      preferred convention.  If a matrix, each row (or column)
%      gives the torque values for each output (facilitating multiple-
%      output models).  In either case, the longer dimension is
%      taken as the time dimension; nonetheless, time indexed along
%      the column dimension is preferred.  Internally, always converted to
%      T(m,co) orientation and to a cell variable, which is used to
%      facilitate multiple recordings (see below).  If a cell array,
%      each element of T must have the same dimensions.  A cell array
%      format (see below) is used to facilitate multiple recordings.
%   D: (scalar integer) Max polynomial power.  D=1 for linear model.
%   Q: (scalar integer) Maximum time lag, in samples.  Q=0 ==> static
%      model.
%   Eamp(ci,m): Matrix of input EMGsigma estimates.  There is no distinction
%     needed to distinguish between extension- and flexion-oriented
%     Eamps.  Each row (or column) gives the estimate across all times.
%     The longer dimension is taken as the time dimension, the other
%     dimension is taken as the channel.  The length of the time
%     dimension must match the T time length.  Internally, always converted
%     to Eamp(ci,m) orientation and to a cell variable (see below).
%     Nonetheless, the preferred convention is for one channel per row
%     (time along the row dimension).  
%     Can also be a cell variable, which is used to facilitate multiple
%     recordings (see below).  If so, each cell element
%     must have the same dimensions.  T and Eamp should NOT have
%     already had start-up portions removed.
%
%  Tol: Tolerance of removal of singular values, based on the ratio of
%     the largest singular value of the design matix.
%  Param: A convenient vector to pack the values of [Q D Tol ii].
%  Trim: Two-element vector, giving the number of time samples to remove
%     from the beginning, Trim(1), and end, Trim(2), of Eamp and T
%     due to the various filter startup transients.  These samples are
%     removed prior to any fitting.  Note that the next Q values in T
%     are also unused, due to the inherent startup of the FIR filter being fit.
%     The first Q subsequent values in Eamp ARE used, since they are lag
%     times to the first-used sample time: m=(Q=1), assuming m=1 indexes the
%     first sample.  Trim values must be non-negative and cannot
%     leave the resulting Eamp and T lengths too small for proper fitting.
%
%  x_plus: Fit coefficients, listed in the normal format corresponding
%     to processing by the pseudo-inverse approach.  See e_sifir_x()
%     for a full description of the coefficient ordering.
%  A: Design matrix for the least squares fit.
%  b: Output matrix for the least squares fit.
%
% Multiple Training Recordings:
% Multiple training recordings can be combined for use in one parameter
% fit.  It is common to do so when recording trials are comprised
% of multiple components and/or multiple repetitions.
% A component refers to a trial that does not provide complete,
% stand-alone information for a parameter fit.  For example, if EMG is being
% related to two degrees of freedom, distinct recordings may have been
% made while exciting each degree of freedom, respectively.  Each of these two recordings
% has insufficient information alone to fit a complete model.  But, the recording
% pair forms a complete information set.  In this case, the pair of
% recordings must always be used together when training a model.
% A repetition refers to a repeated trial, generally with
% identical (or equivalent) recording conditions.  Any one of a set of repeated
% trials, or any combination of them, could be used to fit a model.  For this
% training routine, all supplied repetitions are used to fit the model.
%
% When utilizing multiple training recordings, inputs T and
% Eamp are supplied as cell arrays of the same dimensions.  The
% torque data from one cell is associated with the EMG amplitude data from
% the corresponding cell in Eamp.  Using cell indexing, these inputs
% are indexed as T{Cmp, Rep} and Eamp{Cmp, Rep}, where
% Cmp indexes the components and Rep indexes the
% repetitions.  Each element of T{} contains a vector/matrix as
% specified above, each of the same dimensions.  Each element must
% represent the same output channels.  Thus, if a recording only excites one
% degree of freedom, the user must still supply data for the unused
% degrees of freedom.  Presumably, null vectors would be supplied.
% Each element of Eamp{} contains a vector/matrix as specified
% above, each of the same dimensions.  Each element must represent
% the same input channels.
%
% Note that parameter fitting (as implemented in this function) does not
% distinguish between components and repetitions---all distinct trials
% are combined into one data set for fitting.  The testing function,
% e_sifir_tst, also does not distinguish between components and
% repetitions.  It simply tests all distinct trials.  However, cross-validation
% does distinguish between components and repetitions.
% Components are always grouped together, while repetitions are used
% to form the train-test combinations.
%
% EMG Amplitude Estimation Toolbox - Edward (Ted) A. Clancy - WPI

% Copyright (c) Edward A. Clancy, 2014.
% This work is licensed under the Aladdin free public license.
% For copying permissions see license.txt.
% email: ted@wpi.edu

% 30 December 2014.

%%%%%%%%%%%%%%%%%%%%%%%%%% Process Command Line %%%%%%%%%%%%%%%%%%%%%%%%%%
% A few overall checks.
if nargin~=4, error('Need exactly 4 input arguments.'); end
if iscell(Eamp) == 0, Eamp = {Eamp}; end  % Coerce to cell array.
if iscell(T)    == 0, T    = {T};    end  % Coerce to cell array.
if sum(size(T)==size(Eamp))~=2, error('T and Eamp dimensions differ.'); end

% Eamp and T elements.
for el=2:numel(Eamp)  % Are all elements the same size?
  if sum(size(Eamp{1})==size(Eamp{el}))~=2, error('{Eamp} sizes differ.'); end
  if sum(size(T{1})   ==size(T{el}))   ~=2, error('{T} sizes differ.');    end
end
if length(Eamp{1})~=length(T{1}), error('Eamp and T time durations differ.'); end
for el=1:numel(Eamp)  % Coerce each Eamp as Eamp(ci,m); T as T(m,co).
  if size(Eamp{el},1) > size(Eamp{el},2), Eamp{el} = Eamp{el}'; end
  if size(T{el},1)    < size(T{el},2),    T{el}    = T{el}';    end
end

% Parameters.
if size(Param,1)~=1 && size(Param,2~=1), error('Param not a vector.'); end
if length(Param)~=4, error('length(Param) ~= 4.'); end
Q = Param(1);  D = Param(2);  Tol = Param(3);  ii = Param(4);
if  Q<0, error('Q less than zero.');  end
if  D<1, error('D less than one.');   end
if ii<0, error('ii less than zero.'); end

% Trim.
if size(Trim,1)~=1 && size(Trim,2~=1), error('Trim not a vector.'); end
if length(Trim)~=2, error('length(Trim) ~= 2.'); end
if Trim(1)<0, error('Trim(1) must be non-negative.'); end
if Trim(2)<0, error('Trim(2) must be non-negative.'); end

%%%%%%%%%%%%%%%%%%%%%%%%%%%% Compute the fit. %%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Trim startup transients from the sequences.
for el=1:numel(Eamp), Eamp{el} = Eamp{el}(:,1+Trim(1):end-Trim(2)); end
for el=1:numel(T),    T{el}    = T{el}(1+Trim(1):end-Trim(2),:);    end

% Build the design matix A and output vector/matrix b.
[A, b] = e_sifir_ab(Eamp, T, Param);  % Build design matrix A, output b.

% Perform the least squares and compute the coefficients.
Aplus = pinv(A, Tol*norm(A));  % norm(A) ==> largest singular value of A.
x_plus = Aplus * b;  % Matrix multiply.  x_plus is (Q+1)*ci*D by co.

return