performance - Matlab: Speed-up reading of ascii file

March 15, 2013

I wrote a piece of code that works fine, but it is way too slow for my purposes:

    %%% load nodal data %%%
    path = sprintf('%sfile.dat',directory);
    fid = fopen(path);

    num_nodes = textscan(fid,'%s %s %s %s %d',1,'delimiter', ' ');
    num_nodes = num_nodes{5};
    header = textscan(fid,'%s',7,'delimiter', '\t');

    k = 0;
    while ~feof(fid)

        line        = fgetl(fid);
        [head,rem]  = strtok(line,[' ',char(9)]);

        if head == '#'
            k = k+1;
            j = 1;
            time_steps(k)  = sscanf(rem, [' output @ t = %d']);
        end

        if ~isempty(head)
            if head ~= '#'
                data(j,:,k)  = str2num([head rem]);
                j = j+1;
            end
        end

    end
    fclose(fid);

    nodal_data = struct('header',header,'num_nodes',num_nodes,'time_steps',time_steps,'data',data);

The ASCII file I am reading into MATLAB looks like this:

    # number of nodes: 120453
    #x                  y                   z                   depth               vel_x               vel_y               wse
    # output @ t = 0
           76456.003              184726             3815.75                   0                   0                   0             3815.75
           76636.003              184726             3728.25                   0                   0                   0             3728.25
           76816.003              184726                3627                   0                   0                   0                3627
           76996.003              184726             3527.75                   0                   0                   0             3527.75
           77176.003              184726              3371.5                   0                   0                   0              3371.5
    # output @ t = 36000.788
           76456.003              184726             3815.75                   0                   0                   0             3815.75
           76636.003              184726             3728.25                   0                   0                   0             3728.25
           76816.003              184726                3627                   0                   0                   0                3627
           76996.003              184726             3527.75                   0                   0                   0             3527.75
           77176.003              184726              3371.5                   0                   0                   0              3371.5

While the code I wrote works for small files, it blows up on me for larger ASCII files. I had to abort loading a ~25 MB ASCII file (approximately 240k lines), and that is only a test file. Later versions of the file will be ~500 MB. Is there a way of speeding up the process of loading the file?

I am not happy with the 3 if-statements, but I did not know how to separate the '#' lines from the numbers with a switch on head, because I was not able to distinguish the class of 'head'. I tried checking either ischar or isnumeric, but since the variable 'head' is read as a string, it is always the case of ischar, and isnumeric is never true. I am also not happy about using the tokenizer just to be able to use the if-cases and then putting the line back together here: str2num([head rem]);, which consumes a lot of time. However, I did not know how else to do it. If you have any useful suggestions on how to adapt the code, I would highly appreciate them!

Have a nice Sunday, and thank you in advance!

The code below reads approx 70000 timesteps with 5 nodes per step in around 7 seconds. It covers the core of your code, and it should be easy enough to add the remaining features of your code. There are other ways of doing this that would be faster, but this should be adequate.

    filename = 'd:\temp\input.txt';

    % read the whole file into memory in one call
    filetext = fileread(filename);
    headerlines = 2;
    valuesperline = 7;
    % split the text into its non-empty lines
    expr = '[^\n]*[^\n]*';
    lines = regexp(filetext, expr, 'match');
    % every line starting with '#' is a header or a time step marker
    istimestep = cellfun(@(x) strncmp(x,'#',1), lines );
    numtimesteps = sum(istimestep)-headerlines;
    nodesperstep = ((length(lines)-headerlines) / numtimesteps ) - 1;
    % preallocate so the array does not grow inside the loop
    data = zeros(nodesperstep, valuesperline, numtimesteps);

    for timestep = 1:numtimesteps
        % first data line of this time step (skips the '# output @ t = ...' line)
        lineindex = headerlines + (timestep-1) * (nodesperstep + 1) + 2;
        for node = 1:nodesperstep
            data(node, :, timestep ) = sscanf(lines{lineindex},'%f');
            lineindex = lineindex + 1;
        end
    end

I just tried it on a 2 million line file (340000 time steps with 5 nodes per step), and it took approx 36 seconds to run.
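The loop above fills only data. The remaining fields of the struct in the question (header, num_nodes, time_steps) can be recovered from the same lines cell array. What follows is a minimal sketch rather than part of the timed code: sampletext is a hypothetical in-memory stand-in for fileread(filename), shortened to 2 nodes per step, and the time stamps are parsed with %f instead of %d so that a non-integer value such as 36000.788 is read in full.

```matlab
% Hypothetical sample standing in for fileread(filename); 2 nodes per step.
sampletext = sprintf([ ...
    '# number of nodes: 2\n' ...
    '#x y z depth vel_x vel_y wse\n' ...
    '# output @ t = 0\n' ...
    '76456.003 184726 3815.75 0 0 0 3815.75\n' ...
    '76636.003 184726 3728.25 0 0 0 3728.25\n' ...
    '# output @ t = 36000.788\n' ...
    '76456.003 184726 3815.75 0 0 0 3815.75\n' ...
    '76636.003 184726 3728.25 0 0 0 3728.25\n']);

headerlines = 2;
lines = regexp(sampletext, '[^\n]+', 'match');

% strncmp accepts a cell array directly and returns a logical vector
istimestep = strncmp(lines, '#', 1);

% drop the two header lines, keep only the '# output @ t = ...' lines
stamplines = lines(istimestep);
stamplines = stamplines(headerlines+1:end);

% one time value per marker line; %f keeps the fractional part
time_steps = cellfun(@(x) sscanf(x, '# output @ t = %f'), stamplines);

% node count from line 1, column names from line 2 (without the leading '#')
num_nodes = sscanf(lines{1}, '# number of nodes: %d');
header    = regexp(lines{2}(2:end), '\S+', 'match');

% wrap the cell in {} so struct() creates a 1x1 struct, not a struct array
nodal_info = struct('header', {header}, 'num_nodes', num_nodes, ...
                    'time_steps', time_steps);
```

The data array from the loop can be added to the same struct as a fourth field.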

If you want a solution that doesn't have explicitly coded loops, you can replace the code from

data = zeros(....

with

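    values = cellfun(@(x) sscanf(x,'%f'),lines(~istimestep),'uniformoutput',false);
    % cell2mat gives a valuesperline-by-(number of data lines) matrix, so reshape
    % with the values first, then swap to nodes-by-values-by-timesteps
    data = permute(reshape(cell2mat(values), valuesperline, nodesperstep, numtimesteps), [2 1 3]);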

but this takes 50% longer to run.
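Another loop-free variant is to let textscan skip the '#' lines itself through its 'CommentStyle' option, so that the numbers are parsed in a single call. This is a sketch under assumptions, and no timing was taken for it: it assumes every data row has exactly 7 numeric columns and every time step has the same number of nodes. sampletext is a hypothetical in-memory stand-in for fileread(filename), built from rows of the file shown in the question.

```matlab
% Hypothetical sample standing in for fileread(filename); 2 nodes per step.
sampletext = sprintf([ ...
    '# number of nodes: 2\n' ...
    '#x y z depth vel_x vel_y wse\n' ...
    '# output @ t = 0\n' ...
    '76456.003 184726 3815.75 0 0 0 3815.75\n' ...
    '76636.003 184726 3728.25 0 0 0 3728.25\n' ...
    '# output @ t = 36000.788\n' ...
    '76816.003 184726 3627 0 0 0 3627\n' ...
    '76996.003 184726 3527.75 0 0 0 3527.75\n']);

valuesperline = 7;

% Parse all numeric rows at once; everything from '#' to the end of a line
% is treated as a comment. CollectOutput merges the 7 columns into one matrix.
c = textscan(sampletext, repmat('%f', 1, valuesperline), ...
             'CommentStyle', '#', 'CollectOutput', true);
rows = c{1};                      % (nodes*timesteps)-by-valuesperline

% count the time step markers to recover the block size
numtimesteps = numel(strfind(sampletext, '# output @ t ='));
nodesperstep = size(rows, 1) / numtimesteps;

% rows are ordered node-fastest within each time step, so reshape to
% nodes-by-timesteps-by-values and then move the values to dimension 2
data = permute(reshape(rows, nodesperstep, numtimesteps, valuesperline), [1 3 2]);
```

The result has the same nodes-by-values-by-timesteps layout as data in the loop version, so data(:, :, k) is the block for time step k.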
