trans_setup_2=ASSIGN {
# Blower speed in rpm
variable = SPEED
value = 1234
}
ASSIGN {
# Resulting time increment
variable = TIME_INCREMENT
value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE
}
trans_setup_2 is a cell array created from the text file containing the above text.
I would like to extract the index of ASSIGN { # Blower speed in rpm (first occurrence only), along with the index of the string value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE.
I have tried the following:
ex='(?=ASSIGN).*|(?<=value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_B).*'
ones=~cellfun(@isempty,regexp(trans_setup_2, ex, 'match'))
How should I adapt the search pattern 'ex' to get only the first occurrence of ASSIGN and then value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE?
Please help me with this.

7 Comments

Stephen23 on 11 Sep 2021
@Nit C: please upload the original text file by clicking the paperclip button.
Walter Roberson on 11 Sep 2021
What output are you hoping for with respect to this extract of the file?
Nit C on 13 Sep 2021
Edited: Nit C on 13 Sep 2021
@Stephen @Walter Roberson. Sorry for the late reply. Thanks for taking the time to answer my question.
I have attached a sample text file (the actual file is more than 2000 lines long). I need to find the indices of certain text and words in it, so that I can copy the text from a start index up to a target index and insert it into another input text file.
E.g., I would like the first occurrence of 'ASSIGN' together with '# Blower ...':
ASSIGN {
# Blower speed in rpm
In my main text file the word ASSIGN occurs more than 10 times. So I am interested in getting the first index of ASSIGN and then an end index from the line value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE, so that I get a range of lines to copy into another text file.
Similarly, I would like to find the index of the following text patterns in my file, e.g.
text pattern to search:
MESH_MOTION( "wheel-_3D_of_fluid-wheel" ) {
type = rotation
}
and
another text pattern to search
SIMPLE_BOUNDARY_CONDITION( "wall__fluid-wheel" ) {
shape = three_node_triangle
element_set = "fluid-wheel-_3D_of_fluid-wheel"
My problem is that I have a multi-row cell array to search for the pattern. I have tried many combinations of regular expressions to build the pattern, but I am not getting the exact text.
Thanks (sorry for my English)
Consider the extract you posted above,
trans_setup_2=ASSIGN {
# Blower speed in rpm
variable = SPEED
value = 1234
}
ASSIGN {
# Resulting time increment
variable = TIME_INCREMENT
value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE
}
exactly what output would you like from this? When you talk about "index", do you mean that trans_setup_2 is a character vector, and you want to know the value J such that trans_setup_2(J) is the start of the '1' character of the '1234', and that you want the value K such that trans_setup_2(K) is the start of the '6' character of the '60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE' ? Or do you need the start and end indices, like J1, J2 such that trans_setup_2(J1:J2) = '1234' and K1, K2 such that trans_setup_2(K1:K2) = '60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE' ?
Or do you want the text '1234' and '60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE' extracted, and you do not care about the indices into trans_setup_2 at which they occur?
In the cases where the extracted text is a valid number, do you want the saved value automatically converted to double precision?
Do you need a table as output,
SPEED    TIME_INCREMENT
1234     '60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE'
Do you need a struct,
struct('SPEED', 1234, 'TIME_INCREMENT', '60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE')
Or do you need something else? It would help to state the requirement along these lines:
"I want as output, the following variables: X. X should be a struct array, one entry per block. Each entry should have a field named 'variable' that should be a categorical, and a field named 'value' that should be ..."
Nit C on 13 Sep 2021
trans_setup_path=fullfile('D:\timpts' ,'\trans_setup_2.txt');
trans_setup = fopen(trans_setup_path,'r');
trans_setup_2=textread(trans_setup_path,'%s', 'delimiter','\n','whitespace','');
for i=1:length(trans_setup_2)
%fprintf(char(trans_setup_2(i)));
lines=char(trans_setup_2(i));
end
ex='(?=ASSIGN).*|(?=value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_B).*'
ones=~cellfun(@isempty,regexp(trans_setup_2, ex, 'match'))
Here is the code I have tried. 'trans_setup_2' is a cell array, literally the text file read in line by line, so that I can capture line numbers (I mean the indices of the ones in 'ones').
By index I mean a line number. The final goal is to find the desired lines in the text file using particular text patterns, in order to copy ranges of lines into another file.
trans_setup_path = fullfile('D:\timpts' ,'trans_setup_2.txt');
S = fileread(trans_setup_path);
S = regexp(S, '^ASSIGN\s', 'split', 'lineanchors');
S = regexprep(S, '^{', 'ASSIGN {');
Now S should be a cell array of character vectors. The first one should start with
trans_setup_2=ASSIGN {
and the others should start with
ASSIGN {
and each of them should be an exact copy of a {} block of text.
You probably do not need to know the line numbers to copy: you have the blocks of text right there, so you can copy out of the blocks.
You can parse each block,
vals = regexp(S, 'variable = (?<variable>\S+).*value = (?<value>[^\r\n]+)', 'names');
and, because S is a cell array, that should get you a cell array with one struct per block, each with fields 'variable' and 'value'. You can search those for the variable names you are looking for to determine whether you are interested in copying the block or not.
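As a minimal sketch of that search, assuming the blocks look like the extract posted above: the sample S and the list `wanted` below are made up for illustration and are not part of the original file. Since regexp returns one result per cell for a cell array input, each result is inspected in turn.

```matlab
% Sample blocks standing in for the S produced by the split (illustrative data).
S = { sprintf('ASSIGN {\n# Blower speed in rpm\nvariable = SPEED\nvalue = 1234\n}\n'), ...
      sprintf('ASSIGN {\n# Resulting time increment\nvariable = TIME_INCREMENT\nvalue = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE\n}\n'), ...
      sprintf('ASSIGN {\n# Some other setting\nvariable = OTHER\nvalue = 1\n}\n') };

vals = regexp(S, 'variable = (?<variable>\S+).*value = (?<value>[^\r\n]+)', 'names');

wanted = {'SPEED', 'TIME_INCREMENT'};   % hypothetical list of variables of interest
keep = false(size(S));
for K = 1:numel(S)
    v = vals{K};
    % Blocks without a variable/value pair give an empty struct and are skipped.
    if ~isempty(v)
        keep(K) = ismember(v(1).variable, wanted);
    end
end
selected = S(keep);   % the blocks to copy
```

Here `selected` holds only the blocks whose variable name appears in `wanted`; the same loop could just as well test the 'value' field.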
Copying the block is
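% outfid is assumed to be an open file identifier returned by fopen
number_of_blocks_written = 0;
for K = 1:numel(S)
% stuff: decide here whether block K should be copied; "continue" if not
if number_of_blocks_written > 0
fprintf(outfid, '\n');
end
fwrite(outfid, S{K});
number_of_blocks_written = number_of_blocks_written + 1;
end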
The care about writing \n or not is to avoid writing extra newlines. A newline has probably been eaten by the process of finding the lines beginning with ASSIGN.
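Put together, a self-contained sketch of the copy step might look as follows. The output file name (taken from tempname) and the sample blocks are assumptions for illustration; in real use S would come from the regexp split and only the blocks of interest would be written.

```matlab
% Illustrative blocks; in practice S comes from the regexp split.
S = { sprintf('ASSIGN {\nvariable = SPEED\nvalue = 1234\n}'), ...
      sprintf('ASSIGN {\nvariable = TIME_INCREMENT\nvalue = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE\n}') };

outname = [tempname '.txt'];            % hypothetical output file
outfid = fopen(outname, 'w');
if outfid < 0
    error('Could not open %s for writing.', outname);
end

number_of_blocks_written = 0;
for K = 1:numel(S)
    if number_of_blocks_written > 0
        fprintf(outfid, '\n');          % separator only between blocks
    end
    fwrite(outfid, S{K});               % write the block text unchanged
    number_of_blocks_written = number_of_blocks_written + 1;
end
fclose(outfid);

copied = fileread(outname);             % read back to inspect the result
```

Reading the file back with fileread is only there to check the result; it can be dropped once the output looks right.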
Nit C on 15 Sep 2021
@Walter Roberson, Thanks. This solved my problem.


Accepted Answer

Mathieu NOE on 13 Sep 2021

1 vote

Hello,
my 2 cents: a suggestion using readlines and working on strings.
This simple code can be expanded or modified according to what you need.
rr = readlines('trans_setup_2.txt');
rr_strip = strip(rr,'left'); % remove left blanks
a = find(strcmp(rr_strip,'ASSIGN {'));
b = find(strcmp(rr_strip,'value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE'));
text_extract1 = rr(a(1):b);
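Note that readlines was introduced in R2020b, while the release listed for this question is R2020a. A rough equivalent for older releases, sketched here with fileread and a cell array of character vectors, could look like this. The temporary sample file is an assumption added so the example runs on its own, and strtrim removes blanks on both sides rather than only on the left.

```matlab
% Write a small sample file so the sketch is self-contained (illustrative data).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', 'ASSIGN {', '  # Blower speed in rpm', '  variable = SPEED', ...
    '  value = 1234', '}', 'ASSIGN {', '  # Resulting time increment', ...
    '  variable = TIME_INCREMENT', ...
    '  value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE', '}');
fclose(fid);

rr = regexp(fileread(fname), '\r?\n', 'split')';   % lines as a cell column
rr_strip = strtrim(rr);                            % remove surrounding blanks
a = find(strcmp(rr_strip, 'ASSIGN {'), 1, 'first');
b = find(strcmp(rr_strip, 'value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE'), 1, 'first');
text_extract1 = rr(a:b);                           % lines from first ASSIGN to target line
```

Asking find for only the first hit keeps a and b scalar even if either line occurs more than once in the file.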

1 Comment

Nit C on 13 Sep 2021
I had already used strcmp. But I am interested in going with regexp, because there are many selections of text to make based on general patterns and keywords instead of exact text.
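A regexp-based variant of the same line search could be sketched as below. The sample cell array and the three patterns are assumptions chosen to fit the extract in the question; the idea is to test each line separately, then take the first ASSIGN line that is directly followed by the "# Blower" comment and the first target line after it.

```matlab
% Illustrative cell array of lines, as obtained by reading the file line by line.
trans_setup_2 = { 'ASSIGN {'; '# Some other block'; 'variable = OTHER'; 'value = 1'; '}'; ...
    'ASSIGN {'; '# Blower speed in rpm'; 'variable = SPEED'; 'value = 1234'; '}'; ...
    'ASSIGN {'; '# Resulting time increment'; 'variable = TIME_INCREMENT'; ...
    'value = 60 / SPEED / NR_BLADES / NR_TIME_STEPS_PER_BLADE'; '}' };

% Logical mask of the lines that match a pattern.
hits = @(pat) ~cellfun(@isempty, regexp(trans_setup_2, pat, 'once'));

isAssign = hits('^\s*ASSIGN\s*\{');
isBlower = hits('^\s*#\s*Blower speed');
isTarget = hits('^\s*value\s*=\s*60\s*/\s*SPEED');

% First ASSIGN line that is immediately followed by the "# Blower" comment.
first_idx = find(isAssign(1:end-1) & isBlower(2:end), 1, 'first');
% First target line at or after that start line.
last_idx = first_idx - 1 + find(isTarget(first_idx:end), 1, 'first');

block = trans_setup_2(first_idx:last_idx);   % range of lines to copy
```

Because the patterns allow flexible whitespace and match only the start of each line, they can be loosened to general keywords without needing the exact text.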

Release: R2020a

Asked: 10 Sep 2021

Commented: 15 Sep 2021