Dynamic variable names for full workspace operations
D. Plotnick
on 1 Feb 2017
Commented: per isakson
on 27 Feb 2017
To start with, I understand dynamic variable names are bad. I am not really trying to use them. What I really want to do is apply a specific operation to all variables in the current workspace; this way I can generate a generic function to apply that operation.
Two examples: Example 1
Let's say that I have a code where I can use double or single precision depending on user choice. I want to cycle through all of the workspace variables looking for e.g. doubles that have numel>1000 and convert all of them to single. I can use who to get my workspace, and then a loop with isa and a boolean to find all the variables that match those criteria. What I want to do now is perform the operation varname = single(varname) to reassign those variables to the single-precision class while keeping the same name. Is there a way to do this other than using dynamic variable names?
Example 2
Let's say I ran into an "out-of-memory" error on the GPU because there is a bunch of junk left on there from other operations. I want to cycle through all gpuArray class variables and pull them down using varname = gather(varname), perform a reset(gpuDevice), and then possibly place them back on the GPU using varname = gpuArray(varname). Again, I understand that I could write code that knows all of the variable names; the point here is to write generic code that can do the operation on all the matching workspace variables.
Again, if there is a totally obvious way of doing this that doesn't involve dynamic names, please let me know. Also, if there is something super bad about either of these concepts, I need to know that too.
Otherwise, how do you code something like this using dynamic variable names? MATLAB seems to make that kind of operation intentionally difficult.
Thanks for your help, -Dan
4 Comments
James Tursa
on 4 Feb 2017
Edited: James Tursa
on 4 Feb 2017
For re-classing variables from double to single, I will mention that this will have issues if any of the variables are shared copies of other variables. A loop that does varname = single(varname) will effectively unshare the variables which would have negative memory usage consequences. E.g., a simplistic example:
X = zeros(2^27, 1); % a 1GB double array (2^27 elements x 8 bytes)
Y = X; % a shared data copy of X.
At this point you only have 1GB of data in memory, since both X and Y are sharing data. Now see what happens when you make each of them single class:
X = single(X); % X is unshared with Y and turned into single
Y = single(Y); % Y is turned into single
After the 1st statement, total data memory is 1.5GB. After the 2nd statement, total data memory is back down to 1GB. But X and Y are not shared copies of each other any more, so there is no memory sharing benefit as was the case in the beginning. Ideally, if one knows that X and Y are shared, you would like to do something like this instead to get the total data memory down to 0.5GB:
X = single(X);
Y = X;
But there are no official mechanisms for detecting variable sharing status, either at the m-file level or in a mex routine. The only way to detect sharing is to hack into the variables in a mex routine, and even then it can get messy very quickly if there are cell arrays, struct arrays, or classdef objects involved.
Bottom line is that if there is a significant amount of data sharing involved, re-classing variables serially will wipe that sharing out and have negative memory usage consequences.
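The halving behind these numbers can be checked on a small array. This is a minimal sketch that uses a 1000-by-1000 stand-in for the 1GB array. Note that whos reports each variable's own size and does not reveal whether data is shared, so it cannot be used to detect the sharing discussed above.

```matlab
X = zeros(1000, 1000);        % stand-in for the large double array (8 bytes per element)
infoDouble = whos('X');
X = single(X);                % re-class once ...
Y = X;                        % ... then copy, so X and Y can share the single data
infoSingle = whos('X');
fprintf('double: %d bytes, single: %d bytes\n', infoDouble.bytes, infoSingle.bytes);
```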
Accepted Answer
per isakson
on 3 Feb 2017
Edited: per isakson
on 8 Feb 2017
There are good reasons to avoid eval (here I use eval as shorthand for eval, evalin and assignin); see
- TUTORIAL: Why Variables Should Not Be Named Dynamically (eval) by Stephen Cobeldick
- Alternatives to the eval Function
"Example 1"   I don't think there is a solution without eval. But after all, eval exists in several languages and that's for a reason - I assume.
Here is my attempt to answer "Example 1".
M1 = ones(2e4)+eps;
M2 = ones(1e4)+eps;
variables = reshape( whos('M*'), 1,[] );
for v = variables
    convert( v.name, 'single' )
end
whos('M*')
prints
  Name          Size                    Bytes  Class     Attributes
  M1        20000x20000            1600000000  single
  M2        10000x10000             400000000  single
where
function convert( variable_name, new_class )
    % convert variable, variable_name, to type, new_class, in the workspace of the caller
    %
    % assert that the value of variable_name is the name of a variable in the caller
    xpr = sprintf( 'exist( ''%s'', ''var'' );', variable_name );
    num = evalin( 'caller', xpr );
    %
    if num == 1
        % str = sprintf( '%1$s = cast( %1$s, ''%2$s'' );', variable_name, new_class );
        % sts = evalin( 'caller', str );
        % Error: The expression to the left of the equals sign
        % is not a valid target for an assignment.
        xpr = sprintf( 'cast( %s, ''%s'' );', variable_name, new_class );
        try
            assignin( 'caller', variable_name, evalin( 'caller', xpr ) );
        catch me
            fprintf( 2, 'Error: ''%s''\n', me.message );
        end
    else
        fprintf( 2, 'Undefined variable, ''%s''\n', variable_name );
    end
end
Stephen Cobeldick presents the following list of problems related to eval. I argue that my above use of eval avoids most of these problems.
- Slow   the conversion in the above code is as fast as   M1=cast(M1,'single'); M2=cast(M2,'single');
- Buggy   No, not in this case. convert does one thing and it's possible to test it thoroughly.
- Security Risk   Not in this case. All necessary tests may be done in convert.
- Difficult to Work With   The use of convert should not cause any problems.
- Obfuscated Code Intent   convert communicates the intent well enough.
- Confuses Data with Code   Not applicable in this case.
- Code Helper Tools do not Work   That's true in this case, but F1 works with convert.
 
ADDENDUM, 2017-02-08
An improved version of convert inspired by the comments by Jan Simon
function convert( variable_name, new_type )
    % convert variable, variable_name, to type, new_type, in the workspace of the caller
    narginchk( 2, 2 )
    assert( isa( variable_name, 'char' ), 'convert:IllegalClass' ...
          , '"%s" is not a character array', value2short(variable_name) )
    assert( isrow( variable_name ), 'convert:IllegalSize' ...
          , '"%s" is not a row', value2short(variable_name) )
    assert( isvarname( variable_name ), 'convert:IllegalName' ...
          , '"%s" is not a valid variable name', variable_name )
    assert( isa( new_type, 'char' ), 'convert:IllegalClass' ...
          , 'The type of new_type, %s, is not a char', value2short(new_type) )
    assert( isrow( new_type ), 'convert:IllegalSize' ...
          , 'The value of new_type, %s, is not a row', value2short(new_type) )
    type_list = {'int8','uint8','int16','uint16','int32','uint32' ...
                ,'int64','uint64','double','single','logical','char'};
    assert( any(strcmp( new_type, type_list )), 'convert:IllegalType' ...
          , 'The value of new_type, %s, is not a valid type name', new_type )
    % assert that the value of variable_name is the name of a variable in the caller
    xpr = sprintf( 'exist(''%s'', ''var'' );', variable_name );
    assert( evalin('caller',xpr) == 1, 'convert:UndefinedVariable' ...
          , '"%s" is not a defined variable', variable_name )
    cmd = sprintf( 'builtin( ''cast'', %s, ''%s'' );', variable_name, new_type );
    try
        assignin( 'caller', variable_name, evalin( 'caller', cmd ) );
    catch me
        fprintf( 2, 'Error: "%s"\n', me.message );
    end
end
where
function str = value2short( val )
    % value2short converts value to a short string that is suitable to display
    %
    % See also: mat2str
    %
    if nargin > 0
        str = workspacefunc( 'getshortvalue', val );
        max_len = 48;
        if length( str ) >= max_len
            str = [ str(1:max_len-4 ), ' ...' ];
        end
    else
        str = 'NIL';
    end
end
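The isvarname assertion is the check that keeps arbitrary text from reaching evalin. Here is a minimal sketch of how it classifies a few strings; the candidate strings are made up purely for illustration.

```matlab
% Hypothetical inputs: a valid name, a name starting with a digit,
% and a string that tries to smuggle in a second statement.
candidates = {'M1', '1M', 'M1; disp(pwd)'};
isValid = cellfun(@isvarname, candidates);
for k = 1:numel(candidates)
    fprintf('%-16s valid name: %d\n', candidates{k}, isValid(k));
end
```

Only the first candidate passes, so the other two would be rejected before any expression is built from them.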
12 Comments
Stephen23
on 18 Feb 2017
Edited: Stephen23
on 19 Feb 2017
@per isakson: there is no trick. It was simply me stating what I would do if I were writing a function where at some unknown point during the calculation I needed to change the class of some variables. Here are a few starting assumptions:
- The variables are known. For me this is a perfectly reasonable assumption as I never have unknown variables in my workspace (never use load directly into the workspace, avoid assignin, eval, or other dynamic variable names).
- There are only a few variables. Again for me quite reasonable, because I do not fill my workspace with thousands of variables: that is what arrays are for.
- The variables are accessible to the "change" function.
I do not claim that this will change many arbitrary, not previously specified variables, because I never have unknown variables in my workspace anyway (as we all know, that path leads to JIT problems, obfuscation, and hard to fix bugs). It does not happen in my code, therefore I do not need to solve that problem. I prefer to solve tasks through good design, rather than trying to patch them up later (and hence this nested function).
So in the end my code would have (by design) no unknown variables, and if there were more than a few values, have them stored in some array, giving:
function out = test(N) % try around 12
    %
    Z = 0;
    for k = 1:N
        work()
    end
    %
    function work()
        Z = Z+1; % my work
        % change can be triggered anywhere:
        if rand()>0.8
            change()
        end
    end
    %
    function change()
        if Z>10 % condition
            Z = single(Z);
        end
    end
    %
    out = class(Z);
end
Note that change can be called by any other nested or local functions, callbacks, timers, listeners, etc., at any point during the calculation.
I do not claim that this answers the original question of "cycle through all of the workspace variables": for the reasons I have given that problem would never occur in my code, allowing me to use this simple nested function to simply resolve the task of converting at any arbitrary moment during calculations involving my known variables.
Rather than trying to sledgehammer my way through my workspace, instead I asked myself: what am I trying to achieve, and found an elegant solution for that.
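The "store the values in some array" idea can also be sketched for the double-to-single case from Example 1. This is only one possible layout, assuming the data is kept as fields of a single struct; the names S, big, small, label and minNumel are hypothetical. Dynamic field names then do the job that would otherwise need eval.

```matlab
% Hypothetical container: each former workspace variable is a field of S.
S = struct();
S.big   = ones(50, 50);      % 2500 elements, double: should be converted
S.small = ones(5, 5);        % 25 elements, double: below the threshold
S.label = 'unchanged';       % char: not a candidate

minNumel = 1000;             % threshold taken from Example 1
names = fieldnames(S);
for k = 1:numel(names)
    value = S.(names{k});
    if isa(value, 'double') && numel(value) > minNumel
        S.(names{k}) = single(value);   % dynamic field name, no eval involved
    end
end
```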
per isakson
on 27 Feb 2017
@Stephen Cobeldick, Thank you for your answer. I agree fully regarding "good design" and "no unknown variables".
I assumed as a premise that OP had painted himself into a corner. After reading the question more carefully I realize that OP posed the question out of curiosity.
More Answers (2)
Edric Ellis
on 2 Feb 2017
For the gpuArray case, you could simply use save and load, i.e.
tempFile = [tempname(), '.mat']; % save appends .mat to a name without an extension, so include it here for delete
save(tempFile);
reset(gpuDevice);
load(tempFile);
delete(tempFile);
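The same round trip can be tried on ordinary host data without a GPU. This is a minimal sketch, assuming a made-up variable A; the file name is built with an explicit .mat extension so that save, load and delete all refer to the same file.

```matlab
A = magic(4);                        % hypothetical stand-in for workspace data
tempFile = [tempname(), '.mat'];     % explicit extension: one name for save, load, delete
save(tempFile, 'A');
clear A                              % simulate losing the variable
load(tempFile);                      % restores A under its original name
delete(tempFile);
```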
2 Comments
Walter Roberson
on 3 Feb 2017
I did some poking around and thought I was getting somewhere but it didn't work. I was looking for a way to get at the workspace of the current function, with the idea that altering the workspace would be equivalent to altering the variable. I found that if you declare a nested function and use functions() that you get a workspace of the nested function that includes all variables in the parent assigned at the point you took the handle, which seemed like a doable way of getting access to your own workspace. Unfortunately changing the workspace did not change the variables in the function even for the shared variables. I was not able to get further on this.
It did leave me wondering if it would work for moving values in and out of the GPU array. If you have a shared variable that is assigned a gpu array and you gather it and send it again, then does that affect the original gpu array? The gather is clearly going to bring it back, but the rewrite might instead create a second variable. I consider evalin('caller') to be a form of eval(), though others might disagree I guess.
Joss Knight
on 5 Feb 2017
Edited: Joss Knight
on 5 Feb 2017
Well, if you're really serious about a tool for managing storage of GPU arrays, then you need a new class. This would be a numeric handle type that forwards all its functions to the underlying type, and adds all new objects to a static list. All functions run in a try...catch statement to catch parallel:gpu:array:OOM and, if it is triggered, call a static utility function to gather the contents of the list back to the host and try again.
The only difficulty here is that you need to provide an implementation of every single method you want your new type to implement, i.e. every method of gpuArray (and a few more that aren't methods of gpuArray but are functions that can take gpuArray inputs). But that code could be autogenerated fairly easily.
2 Comments
Joss Knight
on 17 Feb 2017
Edited: Joss Knight
on 17 Feb 2017
It's just a boiler-plate method for any function, so, say, for plus:
function varargout = plus(varargin)
    % This bit swaps out the custom-type arguments for
    % their underlying gpuArray property
    for i = 1:numel(varargin)
        if isa(varargin{i}, 'MyManagedGPUArrayType')
            varargin{i} = varargin{i}.UnderlyingArrayProperty;
        end
    end
    % Try at most twice
    for i = 1:2
        try
            [varargout{1:nargout}] = plus(varargin{:});
        catch me
            % Give up on the second failure or on any error other than out-of-memory
            if i == 2 || ~strcmp(me.identifier, 'parallel:gpu:array:OOM')
                rethrow(me);
            else
                MyManagedGPUArrayType.doSomeGatheringToClawBackMemory();
                continue;
            end
        end
        break;
    end
end
So you create some script that reads a long list of functions and creates a file containing all these forwarding methods, substituting in the name of each function. Well, no, you'd create a utility function for most of this call-gather-call structure and have a much simpler repeated boiler-plate for each method.
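The call-gather-call structure can be exercised without a GPU by simulating the failure. The sketch below is not the managed class itself. It assumes a made-up flag, memoryFreed, standing in for the device state, raises an error carrying the identifier quoted above on the first attempt, and uses 1 + 2 as a stand-in for the forwarded call.

```matlab
oomId = 'parallel:gpu:array:OOM';    % identifier quoted in the comment above
memoryFreed = false;                 % hypothetical stand-in for the device state
result = [];
for attempt = 1:2
    try
        if ~memoryFreed
            % Stand-in for a gpuArray call that runs out of memory.
            error(oomId, 'Simulated out-of-memory on attempt %d.', attempt);
        end
        result = 1 + 2;              % stand-in for the forwarded call
        break
    catch err
        if attempt == 2 || ~strcmp(err.identifier, oomId)
            rethrow(err);            % unrelated error, or the retry is used up
        end
        memoryFreed = true;          % stand-in for gathering arrays back to the host
    end
end
```

The first pass fails and triggers the recovery step, and the second pass completes. Any other error, or a second out-of-memory failure, is rethrown to the caller.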