iterate over a list of feature selection methods
Posted: 05 February 2010 02:49 PM

Hi,

I would like to iterate over a list of feature selection methods in order to evaluate them. The following approach seems to work.

train = ...;  % some training set
% list of feature selection functions
featsels = {featself([], nmc([]), 0, 10), featselb([], nmc([]), 0, 10)};
% iterate over feature selection methods
for i = 1:2
    W = featsels{i}(train);
    ...
end

Is this a correct method? It does not seem to do 10-fold CV. Also, an error occurs when I use:

[W R] = featsels{i}(train);

??? Error using ==> mapping.subsref
Too many output arguments.

Error in ==> featselection at 103
            [W R] = featsels{i}(train);

Could anyone help me out here? Thanks in advance!
Bastiaan

Posted: 05 February 2010 03:01 PM   [ # 1 ]

Hi Bastiaan,

If you only want the trained feature selection mappings, you can simply multiply the dataset by the cell array of untrained selection mappings:

>> a=gendatd(100,10)
Difficult Dataset, 100 by 10 dataset with 2 classes: [55  45]
>> f={featself([],nmc,0,10),featseli}

f =

    [0x0 mapping]    [0x0 mapping]

>> W=a*f

W =

    [10x4 mapping]    [10x10 mapping]

>> W{1}
Forward FeatSel, 10 to 4 trained  mapping   --> featsel
>> W{2}
Individual FeatSel, 10 to 10 trained  mapping   --> featsel

However, if you want to get out both the trained selection mapping and the matrix R describing the selection process, you will, as far as I know, need to invoke the feature selection function directly (and not use the untrained selection mapping):

>> a
Difficult Dataset, 100 by 10 dataset with 2 classes: [55  45]
>> [w,R]=featself(a,nmc,0,10)
Forward FeatSel, 10 to 4 trained  mapping   --> featsel

R =

    1.0000    0.7333    1.0000
    2.0000    0.7467   10.0000
    3.0000    0.7558    7.0000
    4.0000    0.7583    5.0000
    5.0000    0.7533    4.0000
    6.0000    0.7483    8.0000
    7.0000    0.7475    3.0000
    8.0000    0.7467    6.0000
    9.0000    0.7433    9.0000
   10.0000    0.7342    2.0000

The reason is that the overloaded * operator (mtimes) can only return one output.
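
If both outputs are needed for several selection methods, one option is to loop over function handles rather than untrained mappings, so that each function is called directly. This is only a minimal sketch: it assumes featselb accepts the same arguments as featself, and the names funcs, W and R are illustrative.

```matlab
% Example data, generated as in the session above
a = gendatd(100, 10);

% Function handles to the selection functions (not untrained mappings).
% Assumption: featselb takes the same arguments as featself.
funcs = {@featself, @featselb};

W = cell(size(funcs));   % trained selection mappings
R = cell(size(funcs));   % matrices describing each selection process

for i = 1:numel(funcs)
    % A direct call can return both outputs, unlike a*mapping
    [W{i}, R{i}] = funcs{i}(a, nmc, 0, 10);
end
```

After the loop, W{i} should hold the trained mapping and R{i} the matching selection report for each method.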

Hope it helps,

Pavel

Posted: 05 February 2010 03:21 PM   [ # 2 ]

Thanks, that works great.

But there is still one thing that confuses me. When I run the following:

train = ...;  % some training set
% list of feature selection functions
featsels = {featself([], nmc([]), 0, 10)};
% iterate over feature selection methods
for i = 1:1
    W = featsels{i}(train)
    [test1 test2] = featself(train, nmc([]), 0, 10)
    ...
end

then the forward selection from the list (W = featsels{i}(train)) runs three times faster than the direct call ([test1 test2] = featself(train, nmc([]), 0, 10)). How is that possible? (I noticed that the direct call invokes crossfold.m, while the indirect call does not.)

Cheers,
Bastiaan

Posted: 18 February 2010 03:52 PM   [ # 3 ]

In the indirect method you should use the PRTools overload system:

W = train*featsels{i}

With W = featsels{i}(train), only train is substituted as the first parameter; the other parameters take their default values, so the default leave-one-out NN criterion is used.
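
Applied to the loop from the question, the overloaded form might look like the sketch below. The gendatd call only stands in for the real training set, and storing the results in a cell array W is an illustrative choice.

```matlab
% Assumption: generated data stands in for the real training set
train = gendatd(100, 10);

% Untrained selection mappings, each defined with its own criterion
% (nmc) and the 0, 10 arguments from the question
featsels = {featself([], nmc([]), 0, 10), featselb([], nmc([]), 0, 10)};

W = cell(size(featsels));
for i = 1:numel(featsels)
    % Train through the overloaded * operator instead of featsels{i}(train)
    W{i} = train * featsels{i};
end
```

Only the trained mapping is returned this way; the matrix R still requires a direct call to the selection function.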

Bob Duin
